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SYSTEM AND METHOD FOR AUTOMATIC COT OR 

SEGMENTATION AND MINIMUM SIGNIFICANT 
RESPONSE FOR MEASUREMENT OF FRACTIONAL 
LOCALIZED INTENSITY OF CELLULAR COMPARTMENTS 

5 

Related Applications 

This application claims priority from U.S. Provisional Patent Application 

60/363,839 filed March 13, 2002. 

10 

TECHNICAL FIELD 
This application concerns assay of biological material by means of high 
speed, high througjiput cytometry using rractionalized localized intensities of 
subcellular components in magnified images. 

15 

BACKGROUND ART 
Drug discovery screening has historically used simple well-based read- 
outs in order to handle a high throughput of lead compounds. However, a given 
assay currently provides only the information that a drug affects some of the 

20 cellular processes that result in the response measured; the exact nature of the 
target for Ihe drug is not indicated. A cell-based assay is a model system designed 
to identify compounds that interact with a selected molecular target in a specific 
manner. Cell-based assays are robust in that they approximate physiological 
conditions, and they can yield highly complex information. This requires 

25 sophisticated image analysis tools and streamlined data handling. Multi-parameter 
cell assays, where a response is measured by multiplexed reporter molecules, as 
well as morphological criteria, have been limited by the labor-intensive nature of 
imaging and analyzing subcellular details. The power of obtaining more complex 
information earlier in the screening process demands effective solutions to this 

30 bottleneck. 

Dissecting the steps in cellular pathways is important, because multiple 
pathways converge and diverge to provide redundancy (backup in case of cellular 
dysregulation) and as a coordinated response. Whereas a given drug may result in, 
e.g., the secretion of a cytokine (such as Interleukin 2 (EL-2) measured as a single 
35 parameter response in a well plate) the researcher does not know which signaling 
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pathway was utilized, or what other cellular responses were initiated. If the 
signaling pathway used also led to cell death, the efficacy of the candidate drug 
would be compromised and would fail in costly and controversial animal testing. 
Multiplexed cellular responses need to be investigated to eliminate this kind of 
5 false positive lead in drug discovery. 

The complexity of real cell responses also leads to heterogeneity between 
cells, even in a cloned cell line, depending on other factors such as cell cycle 
progression. Thus, a lead which acts on a cellular process which is part of DNA 
synthesis would elicit a response only in those cells which were in S phase at the 

10 time the drug was added. In the clinical situation, continuous infusion can ensure 
all cells are treated, and so the lead is a viable candidate. If an average response 
from all the cells in a well is measured, it may fall below the threshold for 
detection and result in a false negative: an effective drug is overlooked. 

Pharmaceutical companies continually demand raster analysis of screening 

15 tests. Automation has addressed the need to increase data acquisition, but there 
remain stringent requirements for accuracy, specificity and sensitivity. 
Preliminary data indicates that the higher the information content of the assay 
read-out, the less variability there is between individual cells in a responding 
population. Thus, the total number of cells needed to attain the confidence level 

20 required by the experiment is decreased, resulting in increased throughput. More 
accurate analysis results in better dose response information. Higher quality data 
results in better data mining and identification of drugs with significant clinical 
impact. . 

Automated quantitative analysis of multiplexed fluorescent reporters in a 
25 population of intact cells at subcellular resolution is known. Accurate fluorescence 
quantification is made possible by technological advances owned by Q3DM, the 
assignee of this application, and the'rapid processing of this complex image data 
in turn depends on unique computational processes developed by Q3DM. High 
throughput microscopy has only recently become possible with new technological 
30 advances in autofocus, lamp stabilization, image segmentation and data 
management (e.g., US 5,548,661, 5,790,692, 5,790,710, 5,856,665, 5,932,872, 
5,995,143, and US patent applications 09/703,455 and 09/766,390). Accurate, 
high-speed autofocus has enabled fully automated "walk-away" scanning of 
arbitrarily large scan areas. Quantification is dramatically improved by controlling 
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the intensity of illumination, and is dependent on accurate autofocus. Advances in 
image segmentation make detection of bom dimly and brightly stained cells 
simple, so that heterogeneous cell populations can be assayed with statistically 
meaningful results. Finally, rapid streaming of high information content data 
5 depends on efficient image data format and caching. Together, these technological 
advances enable retrieval of high quality, image-based experimental results from 
multiplexed cell based assays. 

SUMMARY AND ADVANTAGES OF THE INVENTION 
10 The most comprehensive tool set that biology drug discovery could 

possess at this time would be one that could integrate technologies involving the 
acquisition of molecular, cell-structural, and cell-activity data. Application of such 
a tool set to Biology programs would transform current end-point cell-assays from 
indirect read-outs (such as IL2) to sub-cellular structural co-localization analyses, 
15 and time-lapse imaging of cell function activities. Quantitative cell imaging 
technologies will lead the way to the next level of rational drug screening and 
design. 

With the automation of this rate-limiting step in the discovery pipeline 
comes the possibility of designing high throughput cell-based assays mat occur in 

20 the context of the whole cell, as well as the ability to multiplex reporter molecules 
so that complex cellular interactions may be taken into account. These assays 
would have measurable economic advantages, because in obtaining more data 
earlier, they offer the possibility of eliminating false positives and false negatives 
due to heterogeneity in cellular response. The benefit to drug discovery is that lead 

25 compounds are better qualified before they enter costly and controversial animal 
trials. 

These improvements and advantages are provided by this invention which 
is realized as an assay system and method for automating color segmentation and 
m in imu m significant response in measurement of fractional localized intensity of 
30 cellular compartments. 

Assay development may yield unexpected results, therefore the invention 
provides the ability to visually inspect cells identified on a multi-parametric data 
plot Gating the cells of interest generates an immediate montage on a display 



3 



WO 03/078965 PCT/US03/07968 
screen, and by relocation of the scanning stage, the user can look through a 
microscope eyepiece for validation. 

New assay development demands that the assay system be adaptable to 
novel reporter molecules, to new cell types, and to the visual nature of the cellular 
5 response. To this end, the assay system of the invention uses object-oriented 
software framework to be open and extensible so that features such as hardware, 
measurement or database functions, and cell classification schemes can be added 
without having to uncover large portions of existing software. In particular, 
adaptive object recognition algorithms incorporated into the assay system of the 

10 invention, including image segmentation and tessellation, allow a user to 
interactively define what constitutes an object of interest in an assay (i.e. the 
nucleus of a given cell type). Implemented interactively, these algorithms are 
incorporated into scanning parameters for that assay, so that automated image 
segmentation and tessellation can enable the speeds necessary for high throughput 

15 screening. 

The accuracy of fluorescence measurements is enhanced due to high 
performance features, notably autofocus, lamp stabilization, image segmentation, 
image tessellation, and image measurement algorithms. The resulting increase in 
the signal-to-noise ratio of the data is evident in diminished residuals in a best-fit 
20 curve, providing enhanced sensitivity. 

BRIEF DESCRIPTION OF DRAWINGS 
Figure 1 is a block diagram of an automated microscopy system according 
to this invention. 

25 Figure 2 is a plot showing a surface of best foci in a 9x15 square 

millimeter area. 

Figure 3 is a diagram demonstrating a process using incremental scanning 
to obtain a best focus which may be used in the practice of this invention. 

Figures 4a, 4b, and 4c illustrate autofocus circuit dynamic range and 
30 sensitivity. 

Figure 5 is a block diagram of an autofocus measurement circuit which 
may be used in the practice of this invention. 

Figures 6a-6d, illustrate enhancement of contrast in magnified images by 
nonlinearly trained (perception criterion) 2D image filters. 
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Figure 7 is a magnified image of biological material showing tessellation 
of cell nuclei. 

Figure 8 is a plot showing results obtained using optical feedback 
stabilization of a 100 W Hg vapor arc lamp. 
5 Figure 9 is an illustration of an image table data structure which may be 

used in the practice of this invention. 

Figure 10 is an illustration of an image table caching algorithm which may 
be used in the practice of this invention. 

Figure 11 is a series of three panels of magnified cell images showing the 
10 effect of multichannel image acquisition and processing software components on a 
BIPI-prepared wellplate. 

Figure 12 includes four panels showing how image segmentation and 
tessellation may be used in the invention to identify and localize cell nuclei. 

Figure 13 illustrates the population statistics for TNFa-induced NFkB 
15 activation in HUVEC cells. 

Figure 14 is a family of curves representing the theoretical dose response 
resolution for or = {.02,.04,.06,.08 s .l,.12,.14} (ascending, a= 0.05 (Type I error 
probability), 0= 0.05 (Type H error probability)). 

Figure 15 shows a theoretical dose response curve with // raax = 1.0, /fciin = 
20 0.0, and/C5o = 8.895 xlO" 5 . 

Figure 16 is a family of plots showing a worst-case response for well-well 
mean FLIN response. 

Figure 17 is a family of plots showing a best-case response for well-well 
mean FUN response 

25 Figure 18 is a family of plots showing the results of a Monte Carlo 

simulation of dose response regression. 

Figures 19a and 19b are curves that show Monte Carlo confidence interval 
width versus sample concentrations. 

Figure 20 is a plot showing a translocation response of the G\ 
30 subpopulation according to the invention. 

Figures 21a and 21b illustrate inhibition of NFxfi nuclear translocation by 
BIPI compound A. 
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Figures 22a and 22b illustrate inhibition of NFkB nuclear translocation by 
BIPI compound B. 

Figures 23 a and 23b illustrate inhibition of NFkB nuclear translocation by 
BIPI compound C. 

5 Figure 24 illustrates inhibition of NFkB nuclear translocation by BIPI 

compound B in a defined G\ cell subpopulation 

Figure 25 is a family of curves produced by plotting time required to 
achieve a minimum significant response as a percent of total NFkB translocation 
on maximum stimulation per well. 
10 Figures 26a and 26b are illustrations of first and second graphical user 

interface (GUI) tools used to initialize a drug screen according to the invention. 

Figure 27 is an illustration of a third GUI tool used to initialize a drug 
screen according to this invention. 

Figure 28 is a flow diagram representing steps executed in initializing a 
15 drug screen. 

Figure 29 is a flow diagram representing a drug screen method performed 
by a programmed computer according to this invention and also representing a 
software program for causing the method to be performed. 

Figures 30a-30h are geometric figures including nodes, edges, circles, and 
20 polygons that illustrate various aspects of tessellation as used in this invention. 
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DETAILED DESCRIPTION 

Fractional Localized Intensity of Cellular Compa rtments (FLIP 

Development of multicompartment models for cellular translocation 

5 events: Many potentially important molecular targets are regulated not only in 

their expression levels, but also by their subcellular or spatial localization. In the 

post-genomics era, ttluminating the function of genes is rapidly generating new 

data and overthrowing old dogma. The prevailing picture of the cell is no longer a 

suspension of proteins, lipids and ions floating around inside a membrane bag, but 

10 involves protein complexes attached to architectural scaffolds or chromatin 
provided by the cytoskeleton, endoplasmic reticulum, Golgi apparatus, ion 
channels and membrane pores. Cell surface receptors are oriented within the 
plasma membrane such that they can bind an extracellular molecule and initiate a 
response in the cytoplasm. Protein complexes in the cytoplasm can be dissociated 

15 following regulated proteolysis to release specific DNA binding proteins. These 
proteins then pass through nuclear pores, interact with the chromatin organization, 
and regulate gene transcription. Proteins are then trafficked through the Golgi 
apparatus where they are readied for functionality. Any of these processes can be 
targets for improving clinical efficacy and reducing side effects, and as such are 

20 important to understand. 

A typical assay to screen for an agonist or antagonist for this process 
would measure the movement of a known DNA binding protein from the complex 
into the nucleus. However, by multiplexing reporter molecules, the same assay 
can also provide information on the receptor internalization. Thus the user can be 

25 sure of which receptor is responding to the drug when the downstream effect is 
mediated. Broadly speaking, mere is a need to visualize protein movement as a 
response to the activation of target signaling molecules, specifically receptors on 
the external membrane binding their ligand (or drug) and internalizing, as well as 
transcription factors moving from cytoplasm to nucleus. The definition of discrete 

30 compartments has been achieved by compartment specific labeling (dyes). The 
plasma membrane is labeled with a lipophilic carbocyanine dye, the cytoplasm 
with a dye that permeates live cells but is cleaved in the cytoplasm so that it 
cannot diffuse out and the nucleus by a membrane-permeant DNA intercalating 
agent 
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Round cells such as lymphocytes present a challenge in identifying 
nuclear, cytoplasmic, and membrane compartments. Resolving the sparse 
cytoplasmic compartment requires high resolution imaging, achieved using a high 
numerical aperture objective lens, but this also narrows the depth of field. There 
5 are many subcellular components, organelles or patterns that also require high- 
resolution images. Robust autofocus becomes necessary; this must be done rapidly 
to meet the demands of high throughput screening, and to ensure accurate 
quantification of fluorescence intensity. The invention employs an autofocus 
mechanism that is an order of magnitude faster than other systems available 

10 because it uses a dedicated circuit to measure image sharpness directly from the 
video signal. Such an autofocus mechanism is robust because it measures the 
change in the optical transfer function (OTF) in a range that avoids the contrast 
reversals inherent in cell images. 

The assays used for generating preliminary data will may include a second 

15 reporter molecule that detects a separate protein. Desirably, more parameters may 
be measured in the same cell over a given time period. By die use of an object- 
oriented software framework, the platform employed by this invention is 
extensible and algorithm-driven for maximum flexibility. To give a hypothetical 
case, a simple comparison of two spatial patterns (e.g., nuclear and cytoplasmic 

20 fluorescence) may be insufficient to determine the efficacy of a lead on a target. 
Even though the positive response occurs, if this is accompanied by changes in 
cell morphology or metabolism, the efficacy may be questionable, or toxicity may 
be a more important limitation. The addition of reporters for these cell properties 
will require additional image analysis algorithms that will classify a set of cellular 

25 responses. Preferably the invention utilizes software plugin architecture in order to 
meet these needs by supporting development of new, assay specific processing 
modules into the image analysis capabilities of the base platform and existing 
assays. 

A high throughput, microscopy platform utilized in the practice of the 
30 invention is illustrated in Figure 1. This platform includes a high performance, 
research-jgrade instrument built as an attachment system around a commercial 
fluorescence microscope. The platform includes a programmable digital (host) 
computer 1 10, preferably a high-end, off-the-shelf Pentium® workstation running 
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the Windows® operating system. Developed and validated assays may be run on 
this system using a straightforward, point and click operation based upon familiar 
Windows® motifs. Other components of this platform include a live monitor 1 15, 
an inverted Nikon TE300 fluorescence microscope 117 with attachments that 
5 include a robotic stage 118, a CCD (charge coupled device) camera 120, and an 
LED phase contrast light source 125. A stabilized mercury arc lamp mechanism 
131 provides Ulumination for operation of the microscope by way of fiber optic 
housing 132 and an optical fiber 133. A control unit 137 drives the stage 118 
while it supports a structure 139 (such as a microtiter plate or another equivalent 

10 assay device) disposed on the stage 118 in response to commands issued by the 
host computer 110 and host computer. An autofocus circuit 142 responds to 
images of obtained by the CCD camera 120, providing processed information to 
the host computer 110. A piezofocus mechanism 148 controls the positioning of 
the lens mechanism of the microscope 117 in order obtain a best focus. The host 

15 computer 110 is programmed to perform routines for focusing the microscope 
117, controlling the stabilized mercury arc lamp mechanism 131, scanning the 
structure 139 by moving the stage 118, intaking and storing sequences of 
magnified images of biological material on the structure 139 produced by the 
CCD 120, processing the magnified images by segmentation and tessellation, and 

20 performing the translocation processing described in more detail below. 

Autofocus is critical for scanning large areas due to the variations in slide 
and coverslip surfaces and mechanical focus instability of the microscope 
(particularly thermal expansion) [M Bravo-Zanoguera and JH Price. Analog 
autofocus circuit design for scanning microscopy. In Proceedings, International 

25 Society for Optical Engineering (SPIE), Optical Diagtiostics of Biological Fluids 
and Advanced Techniques in Analytical Cytology, volume 2982, pages 468-475, 
1997]. These effects combine to effectively create an uneven surface over which, 
the cells must be analyzed. An example of this uneven surface is plotted in Figure 
2, a 3D plot of a surface of best foci (9 x 15mm 2 area) demonstrating the need for 

30 autofocus. The plot represents the surface of best foci with the average plane 
subtracted and shows a range of 6um. [M Bravo-Zanoguera, B von Massenbach, 
A Kellner, and JH Price. High performance autofocus circuit for biological 
microscopy. Review of Scientific Instruments, 69(ll):3966-3977, 1998]. As can be 
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appreciated with reference to this figure, the foci can easily vary through a range 
of ±10fixn over an entire microscope slide. This can have a dramatic effect on the 
ability to locate cells. Using a Nikon Fluor 0.75 NA 20x objective, for example, 
the mean fluorescence intensity has been shown to drop by about 35% at ±10 urn 
5 from focus [JH Price and DA Gough. Comparison of digital autofocus functions 
for phase-contrast and fluorescence scanning microscopy. Cytometry, 16(4): 283- 
297, August 1994]. 

Figure 3 is diagram demonstrating incremental scanning. Best focus from 
the previous field provides the center of the autofocus search range for the next 

10 field. The degree of focus (sharpness) is then measured for several test planes 
(typically 5 - 9) at the video interlaced field rate of 60Hz, and the piezoelectric 
positioner is moved to best focus. Stage movement and autofocus require a total of 
0.3s. 

Incremental scanning is carried out by moving the stage to a field, 

15 stopping the stage, performing autofocus, acquiring the image and repeating on 
the next field. This sequence is shown in Figure 3. If focus lies too close to one 
extreme of the search range, the center of the range can be repositioned and 
autofocus repeated. 

Autofocus on single fields has been extensively explored and reviewed 

20 [Price & Gough]. Further work has extended the techniques to large numbers of 
microscope fields and performing high-speed autofocus with real time image 
processing [Price & Gough]. Careful attention to magnification and sampling led 
to about lOOnm precision (as measured by the standard deviation) in scans of 
thousands of microscope fields with focus achieved in 0.25s. Further development 

25 of this technology led to design and implementation of an autofocus circuit that 
tremendously reduced the cost of autofocus and improved focus precision to an 
average of 56 ran [Bravo-Zanoguera et al.]. Additional theoretical and 
experimental studies on the through-focus OTF helped further support the choice 
of the correct filter for use in isolating the high frequencies for focus measurement 

30 [MA Oliva, M Bravo-Zanoguera, and JH Price. Filtering out contrast reversals for 
microscopy autofocus. Applied Optics, 38(4):638-646, February 1999]. 
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Table 1: Autofocus performance in scanning 
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a. Focus time is (positions + Z)/60 s, or 0.18 s and 0.22 1 fox 9 and 1 1 test positions respectively 

b. Plane fit to data by linear regression and subtracted 
5 c Linear spacing between focus planes 

d. Digital Tracking 

e. Nonlinear spacing between focus planes of 17, 10, 7, 6, 6, 6, 6, 7, 10 and 17 digital units 

f. 48% overlap between contignou* fields 

g. Analog tracking 

10 b. Tracked by average of analog and digital 

1. Nonlinear spacing between focus planes of 22, 1 0, 7, 6, 6, 7, 1 0 and 22 digital units 



Figures 4a, 4b, and 4c illustrate autofocus circuit dynamic range and 
sensitivity demonstrated using (left) a microscope field with no cells and only a 

15 small amount of debris; (Center) shows the plot with autogain disabled (gain of 
1.0), and (right) shows the plot with an autogain of 100. 

The autofocus circuit illustrated in Figure 5 was tested under typical 
scanning conditions [M Bravo-Zanoguera, B von Massenbach, A Kellner, and JH 
Price. High performance autofocus circuit for biological microscopy. Review of 

20 Scientific Instruments, 69(1 1):3966-3977, 1998.]. The purpose of the scanning 
experiments was to determine the ability of the system to track focus, as well as 
measure precision and compare the analog circuit to the digital system. Seven 
rectangular areas, ranging from 1200 to 2500 fields, were scanned in a raster 
pattern. The results are shown in Table 1. For the first six experiments, focus was 

25 computed from the power-weighted average of the focus measurements from 11 
different vertical positions. Li the final experiment, this' number was reduced to 9 
vertical positions. Autofocus was performed 20 times at each microscope field. 
The combined standard deviation (a ) of all autofocus trials (20 fields) in each 
experiment is shown in Table 1 . For every experiment, the precision (as measured 

30 by a ) was better for the analog circuit than the digital system. The combined a s 
for all seven experiments were 0.056jrm (analog) and 0.087fim (digital). This is 
almost an OTder of magnitude better than the 0.528^im depth of field (according to 
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the Fran9on criterion [M Fran9on, editor. Progress in Microscopy. Row and 
Peterson, Evanston, IB, 1961]) and approaches the minimum vertical step of 
0.024|xm (PIFOC piezoelectric objective positioner, Polytec PI, Irvine, CA). 

Table 1 also includes two columns showing the range of best foci for each 
5 area scanned. In one column the raw data range is presented, while in the next 
column the range with the average plane subtracted is shown (nonplanar). The raw 
data range is 6.2 - 16.9pm and is largest for (he biggest area. The nonplanar range 
is 1.6 - 4.1jmn and is still much larger than the depth of field of high NA 
objectives. Other experiments (data not shown) indicated that the absolute 

10 variation from flat increases with area as expected. For example, data from larger 
10 x 15mm 2 scans revealed a 6pan range, and further experience scanning larger 
areas has led us to expect as much as a ±10pm range of foci over an entire slide. 
This range is even larger (as much as hundreds of microns) for the wellplate 
formats that dominate many industrial microscopy applications. 

15 The autofocus circuit of Figure 5 provides very high sensitivity. However, 

it became clear early in the design process that the analog sensitivity was much 
greater than available with subsequent 12 bit A/D conversion. Thus, an autogain 
circuit was added to augment the dynamic range. Figures 4a, 4b, and 4c 
demonstrate the advantage of autogain, and the dynamic range and sensitivity of 

20 the circuit on a microscope field containing only small debris. We have found that 
this circuit is so sensitive that it can track focus even when no cells are present as 
long as there is something, even if only minimal debris, in the field. 

It would be convenient if simple intensity thresholding were be adequate 
for quantitative cell-based assay of cells stained with relatively bright fluorescent 

25 dyes. Unfortunately, fluorescently stained cells invariably exhibit wide variations 
in stain because the cell size and fluorochrome densities vary. Accordingly, real- 
time image segmentation is utilized in this invention for fluorescently stained cell 
compartments in order to make subcellular compartment identification and 4 
localization much less dependent on fluorescence intensity or variability [JH 

30 Price, EA Hunter, and DA Gough. Accuracy of least squares designed spatial FIR 
filters for segmentation of images of fluorescence stained cell nuclei. Cytometry, 
25(4)303*-316, 1996]. This work is illustrated in Figures 6a-6d, wherein 
enhancement by nonlinearly trained (perceptron criterion) 2D image filters to 
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optimally approximate human texture classification capabilities is illustrated. In 
Figure 6a, original images of three DAPI stained 3T3 cell nuclei are shown. 
Figure 6b illustrates the result obtained with application of conventional image 
filters which do not provide adequate contrast enhancement. In Figure 6c, linearly 
5 designed 2D filters also do not provide adequate contrast enhancement Figure 6d 
illustrates that results obtained with nonlinearly designed filters provide dramatic 
contrast enhancement enabling simple intensity thresholding for a final 
segmentation step. This method utilizes nonlinear least-squares optimized image 
filters to create marked object-background contrast for automatic histogram-based 

10 thresholding. This contrast makes the final image segmentation much more 
threshold-independent, and allows dim cells to be segmented with the same 
accuracy as bright ones over a much larger intensity range than was previously 
possible. According to this method, the error function for filter design is allowed 
to sum only the error from a predetermined minimum contrast Allowing 

15 intensities to extend beyond this range, rather than requiring an exact match, 
substantially improved performance (see Figures 6a-6d). This method has been 
tested extensively over many years. For example, Price, Hunter and Gough tested 
ten montage images containing a combined 1,070 fluorescently stained cell nuclei. 
Each image was manually segmented to provide a standard. Then 3x3 through 25 

20 x 25 element 2D filters were trained using the perceptron criterion and nonlinear 
least squares to optimally approximate the human standard. Each filter was tested 
against the other 9 images in order to ensure that the image did not bias this 
supervised design technique; Very high segmentation accuracy was achieved 
using this method, and there was little or no dependence of accuracy on the design 

25 image. In addition, the improvement in accuracy began flattening at the 7 x 7 filter 
size and did not improve beyond the 15 x 15 filter size. General purpose CPUs 
have advanced in processing power so that it is now possible to implement online 
image processing using 7x7 filters without any additional hardware. The fact that 
large filters are not required is advantageous for real-time operation. 

30 This image segmentation mechanism provides the basis of fully 

automated, real-time cytometry from the in-focus image stream. The advantages 
of the precision and accuracy of this method result in improved measurement 
fidelity, improving system throughput and resolution as explored below. 
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Segmentation of objects from background using least-squares-designed 
contrast-enhancing filters can be used to find any cellular compartment or set of 
compartments from a set of fluorescence images. Tessellation can then be used to 
assign each cellular compartment to a cell. Image segmentation of the cell nuclei 
5 creates a single object for each cell. The nuclear masks are then fed to the 
tessellation algorithm to map out regions belonging to each cell and the remaining 
compartments are assigned to a cell based on those regions. Thus, tessellation 
provides an objective means to assign cellular compartments to ceils. 

Image segmentation of objects from background also does not separate 

10 overlapping objects. Even in carefully controlled cell cultures, cells may overlap. 
In many cytometry applications, measurements can be improved by cutting the - 
overlapping objects apart Tessellation can be used to separate overlapping 
objects. Once the images of the cell nuclei have been segmented using contrast 
enhancing filters, the nuclear positions (e.g., centroids) can be input to the 

15 tessellation algorithm to cut the images of other overlapping cellular 
compartments apart and improve measurement fidelity. Mathematical morphology 
techniques (erosion and dilation) and the watershed algorithm are also often used 
to separate overlapping objects. For example, a cell membrane stain (e.g., Dil or 
DiD, Molecular Probes, Eugene OR) or an amine whole cell stain (Alexa 488, 

20 Molecular Probes, Eugene OR) can be used to identify the cytoplasmic and 
nuclear compartments together (the cytoplasmic compartment can be determined 
by subtracting the nuclear compartment). Erosion/dilation or watershed can then 
be used to separate the overlapping cells. Other cellular compartments (e.g., 
endoplasmic reticulum, ER) or vesicles can then be assigned to the cells based on 

25 the resulting image segmentation masks. However, very thin cellular regions of 
the cell that are very dim may not be segmented and may be absent from the 
resulting masks. Cell compartments such as ER or vesicles that fall on the mis sing 
areas will not be assigned to a cell. Tessellation can then be used to complete the 
assignment It also may be inconvenient or very difficult to add a stain to identify 

30 the cytoplasm. In the absence of a cellular mask, tessellation can be used to assign 
cellular compartments to cells (nuclear masks). 

in this invention, once the magnified images of the cell nuclei have been 
segmented using contrast enhancing filters, tessellation of the segmented image is 
utilized to precisely define nuclear positions (e.g., centroids) in order to cut the 
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images of other overlapping cellular compartments apart and improve 
performance. Tessellation, according to the unabridged, on line Merriam-Webster 
Dictionary is "a covering of an infinite geometric plane without gaps or overlaps 
by congruent plane figures of one type or a few types'*. 
5 Essentially, tessellation of a magnified image in this description concerns 

the formation of a mosaic or a mesh of plane figures on a magnified image of cells 
in which each plane figure of the mosaic or mesh contains one object or 
compartment within a cell. Such objects or compartments include, without 
limitation, nuclei, membranes, endoplasmic reticuli, Golgi, mitochondria, and 

10 other cellular regions having protein concentrations that are organized in some 
way. Further, such magnified images are processed by segmentation to distinguish 
identical cell components, such as nuclei, from the background and all other cell 
components. The processed, segmented image is then tessellated to separate each 
of the distinguished cell components from nearby and overlapping neighbors. For 

15 example, refer to Figure 7 in which segmented cell nuclei are separated by a 
tessellation mesh of plane figures wherein each plane figure in the contains one 
nucleus. In this figure, for example, segmented nucleus 700 contained within the 
polygon 710 is separated and therefore distinguishable from its near neighbor 
nucleus 720 which is contained within polygon 730. Knowing the location, shape, 

20 and dimensions of each plane figure in the mosaic, the nuclei in the magnified 
image can be quickly accessed by traversing the mosaic in a regular fashion. The 
mesh in Figure 7 is visible only in order to support an understanding of 
tessellation. In practice, the mesh resulting from tessellation can be visualized 
using conventional image processing, manual entry and display means. In 

25 processing of magnified images by programmed means, the mesh would, of 
course, also exist as one or more stored data structures. These data structures 
enable automation of tessellation by programming of the host computer 110. 

The platform of Figure 1 also includes a stable iUumination source for 
epifluorescence. Light source stabilization can be critical in image cytometry 

30 systems. . While many other instruments (e.g., spectrophotometers and flow 
cytometers) are able to overcome this problem by monitoring and correcting 
source fluctuations, image cytometry is different because it is difficult to monitor 
and correct for the 250,000 or more detectors in a CCD camera. Accordingly, the 
source is constituted of a stable 100W Hg vapor light source employing an optical 
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fiber to scramble out the spatial variations (from arc wander), and feedback 
control to eliminate the remaining temporal instabilities [S Heynen, DA Gough, 
and JH Price. Optically stabilized mercury vapor short arc lamp as uv-light source 
for microscopy. In Proceedings, International Society for Optical Engineering 
5 (SPIE), Optical Diagnostics of Biological Fluids and Advanced Techniques in 
Analytical Cytology, volume 2982, pages 430-434, 1997]. The illumination 
stability provided is shown in Figure 8. The gray trace is of an unstabilized lamp 
and the dark trace shows the result of feedback stabilization with our system. Bom 
traces are envelopes showing the minimum and maximum intensity values 

10 recorded at 10 s intervals. The intensities were measured at 30Hz using an 
Imaging Technology, Inc. Series 151 image processor. This long-term stability is 
exceptional. With feedback, the intensity coefficient of variation was reduced by 
over 30 times. This intensity control will greatly improve measurements that 
depend directly on the fluorescence intensity, such as in cell functional assays. 

15 The invention further utilizes image area-of-interest (AOI) data structures 

and caching algorithms. One aspect of scanning cytometry is that the inherent 
view of the data is a contiguous image of the slide sample, composed of tens of 
thousands of microscope fields and gigabytes of raw image data (typically larger 
than the conventional 32-bit addressable space). Efficient access to AOIs in these 

20 data is required by both front-end users through the graphical interface, and by 
application programmers and assay developers. This access can only be provided 
by novel image data structures and caching algorithms that organize disk-resident 
AOIs in the scanned sample, and shuffle them in and out of RAM as required. An 
"image table" data structure has been developed [EA Hunter, WS Callaway, and 

25 JH Price. A software framework for scanning cytometry. In Proceedings, 
International Society for Optical Engineering (SPIE), Optical Techniques in 
Analytical Cytology IV, San Jose, CA, volume 3924, pages 22-28, January 2000.] 
to organize and manage image data for efficient access by the most prevalent use 
scenarios, including continuous (user scrollable), discontinuous (arbitrary 

30 rectangle), and database driven (query response) sample browsing and queries. 
Figures 9 and 10 illustrate designs for AOI metadata structures and associated 
caching algorithms. The image table is responsible for image area-of-interest 
management in a format designed to facilitate efficient AOI recall based on 
viewport overlay geometty. Efficient caching algorithms move disk-resident AOI 
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data into RAM as needed. This allows for user browsing or sequential access of 
tens of gigabytes of image data or more on standard PC workstations. 

Automated Fluorescence Microscopy for Cell Functional Analysis in a 
5 Cvtoplasm-to-Nucleus NFkB Translocation Study 

This section describes the results of a study to quantify die subcellular 
distribution of the regulatory protein nuclear factor kB (NFkB) in response to 
cellular stimulation. Upon stimulation, for example by proinflammatory 
cytokines, the NFkB's inhibitory subunit is phosphorylated and subsequently 

10 destroyed by proteasomes. Loss of the inhibitory subunit frees the modified 
protein's p65 regulatory subunit to translocate from the cytoplasm into the nucleus 
where it can activate defense genes in response to the external stimulation. 

Accurate and precise subcellular quantification of immunofluorescently 
labeled NFkB from microscope images provides a direct means of assessing the 

15 ability of a compound to inhibit cellular function in the presence of stimulation. 
This study examines cytoplasm-to-nucleus NFkB translocation in HUVEC cells in 
response to tumor necrosis factor a (TNFa ) s as an archetype of an image-based, 
cell functional assay. Because any other cellular compartments can be isolated and 
precisely quantified by specific imrnunofLuorescent labeling working in tandem 

20 with the image acquisition and analysis technology described above, the speed and 
precision advantages demonstrated here are generally available to cell functional 
assays. The experiments also allow direct comparison of the results achieved by 
practice of our invention using the platform of Figure 1 against previously- 
published results using different technology [G Ding, P Fischer, R Boltz, J 

25 Schmidt, J Colaianne, A Gough, R Rubin, and D Uiller. Characterization and 
quantitation of NFkB nuclear translocation induced by interleukin-1 and tumor 
necrosis factor-a. Journal of Biological Chemistry, 273 (44) :28 897-28905, 1998] 
as measured by fidelity and system throughput. We show comparisons of cell 
compartment measurements in a substantially reduced well-population standard 

30 deviation, which is a function of both measurement technology and inherent 
biological heterogeneity. We show how response variability translates into 
reductions in the number of cells required to reliably detect fractional responses, 
and also how it affects inhibitor response estimates (via a Monte Carlo simulation 
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of a model response IC50 distribution). Hie ability to segregate cell images based 
on morphological information leads to further improvements by focusing 
measurement on a homogeneously responding subpopulation. Scanning and 
analysis are then validated by analyzing the response of three BIPI translocation 
5 inhibitor compounds. Finally, system throughput is explained and estimated scan 
rates are given for typical experimental parameters. 
Experimental Procedures 

The cells used in these studies were primary human umbilical vein 
endothelial cells (HUVEC) obtained from Clonetics Corporation. The cells were 

10 maintained in culture under standard conditions and passaged using Clonetics' 
proprietary EGM medium and reagent system. Since they are not transformed, 
HUVECs, generally become senescent after nine or ten passages. Cells used in 
these studies were from earlier passages and did not exhibit senescence associated 
morphological changes. 

15 Prior to assay, cells were transferred to 96-well plates (Packard black 

ViewPlate) and incubated overnight to yield cell monolayers that were 
approximately 25% confluent. Plates used to determine a statistically optimal cell 
density for the assay were made by seeding wells at 5000, 2500, and 1000 cells 
per well and incubating overnight Three selected compounds were tested for 

20 inhibition of TNFa stimulated NF/eB translocation in HUVEC. The three 
compounds and controls were laid out in the plates as shown in Tables 2 and 3. 
Test compounds were directly diluted from DMSO stock solutions into medium to 
yield 60 mM compound solutions containing less than 0.7% DMSO. These 
solutions were serially diluted in medium to generate compound concentrations as 

25 low as 0.1 mM. After the medium was Temoved from the test plate, 100 jil 
aliquots of each compound dilution were dosed into triplicate wells. Unstimulated 
and TNFa stimulated control wells received 120 and 100 ^1 of medium, 
respectively. The cells were pre-incubated for 30 minutes at 37°C before they 
were stimulated by adding 20 ml of TNFa (1200 U/ml medium) to each well. The 

30 final stimulating concentration of TNFa used in this assay was 200 U/ml. After 1 5 
minutes incubation, the cells were fixed with 3.7% formaldehyde and then 
processed by using the NF*B kit reagents and protocol obtained from Cellomics 
(Pittsburgh, PA) to stain for cellular NFkB. In brief, cells are permeabilized, 
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washed, incubated with rabbit anti-NF«B primary antibody for 1 hour, washed, 
incubated with a secondary anti-rabbit IgG antibody-Alexa Fluor 488 conjugate 
for 1 hour, and washed again. Nuclei were stained with either Hoechst dye 
included in the secondary antibody solution or with DAPI. When used, DAPI was 
5 added in a final wash buffer solution at 100 to 400 ng/ml and kept in place during 
storage and examination. Translocation of NFkB from the cytoplasm to the 
nucleus was assessed by visual examination of sample wells with a confocal 
microscope system. 



Row 


Cells per Well 


A 


Blank 


B 


5000 


C 


5000 


D 


2500 


E 


2500 


F 


1000 


G 


1000 


H 


Blank 



10 Table 2: Template for 96 well NFaB control plate. Rows B£>,F unstimulated; rows C,E,G 
stimulated with 200 U/tnl TNFor for 15m. 
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Row 


Rows 1-3 


Rows 4-6 


Rows 7-9 


Rows 10 - 12 


A 


200 U/ml TNFa 


200 U/ml TNFa 


Unstimulated 


Unstimulated 


B 


0.1/M/A 




0.1/iWC 


Blank 


C 


0.3/rt/A 




0.3/d/C 


Biank 


D 








Blank 


E 




3/jMB 




Blank 


F 


10/iWA 






Blank 


G 


30/d/ A 


30/^fB 


30/zAfC 


Blank 


H 


60a<WA 


60/z/kfB 




Blank 



Table 3: Template for 96 well BIPI inhibitor plate. Row A controls: stimulated with 200 
U/ml TNFa for 15m (columns 1-6), unstimulated (columns 7-12). Rows B-H: triplicate 
wells of compounds A,B and C, at 0.1,03,1,3,10,30,60 /zW concentrations. 



20 The high-throughput microscopy platform illustrated above in Figure 1 

was used to automate fluorescent image acquisition and analysis of the NFkB 
assay. The primary system components consist of a standard fluorescent inverted 
microscope (Nikon TE-300); a Pentium in workstation; a control system, which 
houses the many electronic components; and software which controls the system 

25 for acquisition and analyzes the acquired images. The objective used in the scan 
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was a Nikon CFI60 20x .5 NA Plan Fluor. The spatially and temporally stabilized 
fluorescent light source is a uses a 100 watt Osram HBO 103 W/2 mercury arc 
lamp as its light source and connects to the microscope at the Nikon epi- 
fluorescent attachment. A standard Chroma Dapi/Hoechst filter cube with a long 
5 pass emission filter was used for the nuclear channel. A standard FITC filter cube 
with a long pass emission filter was used for the Alexa488-labeled NFkB channel. 
A Cohu 640x480 pixel 10-bit progressive camera (model 6612-3000; 9.9/zrc 2 
pixels) was used for image acquisition. 

Measurement of Distributions or Fractional Localized Intensities (FLI) of 

10 Subcellular Compartments 

Cellular substances are dynamically distributed during cellular responses. 
Although there may be hundreds of thousands to millions of different cellular 
substances, a cellular response can be measured by specifically labeling the subset 
of substances participating in the response. At any point in time, the combination 

15 of compartment identification and specifically labeled substances can be used to 
take a snapshot of the distributions. Image segmentation creates the image masks 
for each cellular compartment. For example, a least-squares-designed contrast- 
enhancing filter is used on each of the cellular compartment images to create 
segmentation masks. The nuclear segmentation is then used as a guide to separate 

20 overlapping compartments using tessellation. These image segmentation 
techniques work best on many types of cellular images. But other segmentation 
techniques can also be used to generate the segmented masks for each cellular 
compartment The measurement logic then loops through each set of pixels 
identified by the compartment masks and sums pixel intensities I(x; y). As an 

25 example, assume membrane, cytoplasmic and nuclear masks, m, c and n, 
respectively, with sizos N c , N n and N ro . The distributions over the compartments 
are defined as the fractional localized intensities of each compartment. The 
fractional localized intensity of the cytoplasm F c is 

30 The equations are analogous for the fractional localized intensities of the nucleus 
F n and membrane F m , and F c + F n + F m = 1. The physics of fluorescence image 
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formation leads to the use of integrated intensity to quantify cellular distributions. 
The emission intensity at pixel location (x, y) in an image plane is 

I(x,y) = )loQeudz = hQsu % z (2) 
o 

with incident (excitation) intensity I 0 , quantum yield Q, extinction coefficient e, 
5 local and column average fluorophore concentrations u and u\ and column 
thickness z. When the depth of field is greater than the cell, image formation 
effectively integrates the sample in z, the direction of the optical axis. When the 
depth of field i$ smaller than the cell as with high NA, confocal and multiphoton 
optics, explicit integration in z may more accurately represent the intensity. 
10 Assuming this integration has already taken place either optically with large 
depths of field or computationally with small depths of field, intensity 
measurements integrate in the orthogonal dimensions 

Yjfry) m J&L*Vt>y*fry) (3) 

fry) fcy) 
which has units proportional to moles fluorophore. F c becomes 

15 Fe - g (4) 

(Jf.y)6c id" (*.y)&* & 



with units of moles fluorophore. As before, the equations are analogous for the 
fractional localized intensities of the nucleus F n and membrane F m . Quantum 
yields are potentially compartment specific (i.e., vary with pH, ion concentration 

20 and other local physical parameters) and can be established experimentally as part 
of protocol development. Note that direct integration over the compartment image 
segments is preferred over average compartment intensity ratios or differences 
used in US 5,989,835 because a ratio of areas causes a bias factor that confounds 
direct interpretation unless N c = N n = N m , which can only be artificially achieved 

25 by discarding a majority of the cytoplasm and nuclear signal because typically 
N c ,N n > Nm. The cellular (or subcellular compartment) area is a function of height 
even when volume is fixed. The same amount of a cellular substance can be 
distributed over different volumes or areas. Averaging the intensities over the 
areas thus introduces a dependence on something that has nothing to do with the 

30 amount of the labeled substance. Note also that in equations (1) and (4) F c is the 
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fraction of the total integrated fluorescence over all of the compartments, rather 
man the fraction of one compartment to the other. 

Generalizing further, a compartment k in an arbitrary number t\ of 
compartments C has fractional localized intensity 

5 F f4 =^ and^F =1 (5a) 

i-1 (x,y)eC, 



10 



Similarly, any subset of compartments can be combined by addition to produce 
multi-compartment fractional localized intensities. For example, the fractional 
localized intensity of compartments 1 and 3 is 

and so on. 



Results and Error in the FLIC 

15 Fractional Localized Intensity Measurements on TNFa-induced NFkB 

Cytoplasm-Nucleus Translocation: Immunofluorescent staining of the NFkB p65 
regulatory subunit allows for easy characterization of the intercellular NFkB 
distribution by visual inspection of images acquired during scanning. The majority 
of fluorescence lies outside the nuclear area in unstimulated cells, while a 

20 substantial amount co-locates with the nuclear mask subsequent to maximal 
stimulation with TNFa (Figure 1 1). The panels of Figure 1 1 show translocation of 
cytoplasmic NFkB p65 regulatory subunit (unstimulated; left images) to cell 
nucleus (stimulated; right images) subsequent to stimulation of HUVEC cells with 
TNFa. Cell densities were 5000 cells/well (top; rows B,C), 2500 cells/well 

25 (center; rows D,E) and 1000 cells/well (bottom; rows F,G). A clear majority of 
Alexa488-labeled p65 resides ubiquitously in the cytoplasm before stimulation, 
while bound to the inhibitory subunit. Stimulation with TNFa causes 
phosphorylation and separation of p65 from the inhibitory subunit, allowing for 
translocation to occur. Only » 20% of gross p65 translocates upon maximal 

30 stimulation (qualitative visual inspection of images tends to overemphasize the 
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higher fluorophore concentration in the nuclei of stimulated cells, and suggest a 
greater translocation). Note how background fluorescence varies between 
wells. As stipulated in previous reports, the traction of p65 to translocate on 
stimulus represents « 20% of the gross amount contained in unstimulated cell 
5 cytoplasm. [Ding et al.]. Effects due to background fluorescence are corrected by 
estimating and subtracting title mean background image intensity, which was 
found to be approximately constant from image-to-image, but can vary from well- 
to-well. Therefore, corrections were performed on a per well basis. 

Co-location of p65 immunofluorescence with the nuclear mask can occur 

10 in unstimulated cells as image formation integrates the p65-bound fluorophore 
emission through the three dimensional sample parallel to the optical axis, and 
therefore is susceptible to contribution above or below the nuclear volume. 
Similarly, stimulated cell images carry residual cytoplasmic p65 
immunofluorescence contributions co-located with the nuclear mask. Although 

15 this effect may be small, it introduces a bias in fractional localized intensity 
measurements. For applications where this effect is not negligible, a correction 
procedure could be developed by correlating nuclear and cytoplasm absolute 
integrated intensities in unstimulated cells to establish the average contribution of 
cytoplasm resident fluorophore to nuclear measurements as a function of 

20 cytoplasmic intensity. 

Figure 12 illustrates image acquisition processing stages in the system of 
Figure 1. To obtain the images of Figure 12, the nuclear channel (Hoechst 33342 
or DAPI) was scanned over the 10 x 10 field area in each well to populate the 
image table with nuclear intensity images (upper left). The upper right panel 

25 shows the associated nuclear mask images produced by filtering and intelligent 
thresholding of the in-focus acquired images. Subjectively, masks are consistent 
with the quantitative study conducted by Price, Hunter, and Gough, which 
demonstrated accurate and precise nuclear masks despite significant local image 
variations and a wide range of nuclear intensities from cell-to-cell. Such variations 

30 can confound simple techniques, because the accuracy and precision of the 
resulting masks would substantially bias localized intensity measurements. Using 
human-segmented training and testing data, this ffltering technique has been 
shown to achieve accuracy >93% correct classification of pixels and precision of 
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less than about 1% coefficient of variation [Price, Hunter, and Gough]. This is 
acceptable, and perhaps a limiting performance level, since human-human 
agreement in manually segmented examples is likely to be comparable. The scan 
area tessellated by nuclear centroids is shown in Figure 12 (lower left). This 
5 tessellation structure is then used to dynamically assemble and associate the 
quantitative p65 channel (Alexa488) with the individual nuclear images (lower 
right). Reliable alignment between channels provides the basis for accurate 
fractional localized intensity measurements. 

In Figure 12, individually addressable images of Hoechst 33342- or DAPI- 

10 labeled cell nuclei were identified and extracted from the image stream produced 
during scanning. These images were saved to disk using an image table data 
structure (Figure 9) that maintains the relative positions of each nuclei (upper 
left). Binary nuclear mask images (upper right) show the accuracy of the nuclear 
segmentation processing achieved, which is critical for compartment localized 

15 measurements. Voronoi tessellation of nuclear centroids (lower left) partitioned 
the data structure and scan area into polygonal regions used to correlate 
subsequent channels with individual nuclei (lower right), in order to maximize the 
signal from every cell. 

Maximal NFkB translocation following 200 U/ml TNFa stimulation of 

20 HUVEC cells was quantified by measuring fractional localized intensity in the 
nucleus (FLIN) (fractional localized intensity in the cytoplasm (FLIC) is 
equivalent in the two-compartment translocation model used in this study; FLIC + 
FLIN = 1) for every cell completely within the scan area (Table 4). At the 
different jcell densities, FLIN sample mean, standard deviation (a ), standard error 

25 (SE) and coefficient of variation (CV) were calculated in 12 replicate wells (Table 
4). Well-to-well sample statistics for all six density x stimulation treatments of the 
within-well sample mean FLINs are reported in Table 5, and illustrated by the 
constant horizontal lines in Figure 13. Figure 13 illustrates the population statistics 
for TNFa-induced NFxB activation in HUVEC cells. FLIN response (ji±2cr) is 

30 plotted per well for unstimulated (lower FLIN) and maximally stimulated 
(increased FLIN) cells (200 U/ml TNFa ) for 5000 cells/well (top; rows B,C), 
2500 cells/well (center; rows D,E) and 1000 cells/well (bottom; rows F,G). The 
horizontal lines give across row mean-of-means ± 2 x SE. 
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Table 4: Well-wise and row-pooled FUN statistics. CVs defined as (std dev)/mean 
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0.004564 


0226347 


11F 


178 


0.156161 


0.051168 


0.003858 


0.320902 


11G 


317 


0.341355 


0.076450 


0304294 


0223981 


12F 


160 


0.158985 


0.049600 


0.003921 


0311977 


12G 


235 


0347823 


0.089314 


0.004322 


0.109394 


RowF 


2087 


0.1564 


0.0570 


0.0012 


0.3842 


RowG 


3606 


03537 


0.0779 


0.0013 


02203 



We measured the occurrence of 18.18% - 19.68% (18.80% 
5 average) translocation of labeled NFkB, which we calculated by averaging the 12 
row replicate well FLIN sample means per row, and differencing stimulated and 
unstimulated averages at each cellular density. Heterogeneity of cellular response 
to TNFa stimulation is apparent by visual inspection of acquired images and 
summarized in the ±2a confidence interval widths in Figure 13 (the larger interval 
10 around each well mean). Stimulated and unstimulated well populations have 
significant overlap in response. However, die average response of 1 8.80% is large 
compared to the FLIN standard errors which are calculated within wells in Table 4 
arid shown graphically in Figure 13 (smaller interval about each well mean). For 
example, the average SE in row E, 3.46 x 10~ 3 , gives a 4 SE empirical response 
15 resolution of 0.0139 FLIN increments, or about 7.4% increments over the 1 8.80% 
signal dynamic range. 

By analyzing the spread of well-average FLIN measurements from well- 
to-well, we can assess well-to-well variability and repeatability of the experiment. 
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The row aggregate FLIN sample means, calculated by averaging the 12 replicate 
well sample means in each row, have a standard error of 1.6 x 10" 3 - 7.20 x 1CT 3 
(CV of 2.34% - 13.93%) (Table 5). This well-to-well sample mean variability is 
visualized by the horizontal aggregate mean and ± 2SB lines in Figure 13. 

5 Deviations of the individual well responses from this region show the effect of 
protocol perturbations in individual well environments. Although statistically 
significant differences exist, the physical or scientific significance in terms of 
effect on measured translocation is typically small in tins experiment, indicating 
the possibility of limiting the number of replicate wells required in an optimized 

10 and controlled protocol. As a quantitative example, a trimmed between- well CV 
for row D in Table 4 reduces variability from 13.93% to 4.76%. Note that row D 
had the highest CV and that the numbers were trimmed to the six median wells 
(using the assumption that those wells with the most human preparation error were 
at the tails of the distribution). Wellplate replicate layout to mitigate the effect of 

15 well-to-well variability must be determined during specific assay development. 

Table 5: Well-to-well statistics of well-average FLIN values (n=12). Mean, standard 
deviation, standard error and coefficient of variation of the FLIN well sample means 
across each row. 



Row . 


mean 


<T 


$E 


CV 


B 


0.1346 


0.0055 


0.0016 


0.0412 


C 


0.3199 


0.0198 


0.0057 


0.0618 


D 


0.1798 


0.0250 


0.0072 


0.1393 


E 


0.3616 


0.0085 


0.0024 


0.0234 


F 


0.1564 


0.0110 


0.0032 


0.0704 


G 


0.3532 


0.0184 


0.0053 


0.0521 



20 

Automatic Determination of Cells/Well Required to Reach a Minimum 
Significant Response 

Theory of Dose Response Resolution: To underpin the empirical dose 
response resolution theoretically and to properly stage a baseline comparison of 

25 the data with results produced using a different technology, an objective definition 
of the meaning of response resolution and its dependence on measurement 
fidelity, sample size and error controls is needed. Inhibitory responses are 
estimated by nonlinear regression from a collection of experimental well-average 
measurements distributed across a range of inhibitor concentrations. Sources of 

30 variability in curve estimates include high population variability (heterogeneity) in 
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response, well-to-well variability and inadequate sampling of inhibitor 
concentration. These factors are naturally dependent; measurement precision 
places limitations on measurable differences in response. Experiment design 
involves optimizing jointly over well datapoint replicates and individual well 
5 measurement quality (a function of the number of cells per well). 

A direct measure of system resolution is the minimum statistically 
significant response that can be reliably measured. An equivalent measure of 
system performance is, for a specific minimum significant response implicated by 
choice of inhibitory response model, the number of cell measurements required at 

10 each inhibitor concentration to reliably estimate the response model parameters. 
For two populations of responding cells, we want to determine the minimum 
difference in mean FLIN that can be reliably detected as a measure of system 
resolution. For this, we define a two-sample hypothesis test 

15 H 0 :m-M2 = 0 (6) 

ft:/4-/fc>0 (7) 

where n\ and ju 2 are the true (unobserved) mean FUN responses, and Hq and H a 
are the null and alternative hypotheses. This is an upper-tailed test with decision 
20 rule z > Za> where z = (x\ - X2) / (cr V lln) is the unit variance normalized test 
statistic for sample means *i and x 2 , and z a is the threshold for rejection of an Or 
level test, 

\ a=Pr(z>^|H 0 ) = l-£^^ = l-O(z a ) (8) 

25 The type I error probability (a) is the likelihood that two identically inhibited 
samples will, by natural variations, show a imnimally significant response 
difference in measurements). The probability of Type I errors is controlled by 
fixing its value in advance, e.g. a = 0.05, as determined by assay requirements. 
Similarly, the type II error probability (fi) is the likelihood that two samples with 

30 minimally different responses will not show a significant difference in 
measurements, and it is similarly determined and fixed by assay requirements. To 
relate these assay parameters and derive a measure of precision, we express the 
probability ^mathematically, assuming an absolute minimum significant response 
(MSR) Aji 
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Pr(z<z a |// l - y « J =A//)= y 5(A / u) (9) 

K a ^ < *-d^-*^)-^ <r - , ° (io) 

5 This expresses >£as a function of A/z, cr (FLIN population standard deviation) and 
n (sample size). The assumption that both populations share the same a is 
reasonable because in the limit, the minimum detectable difference approaches the 
same population (formulae for the case o\ * c% are easy to derive, however, when 
we need an estimate of sample size requirements for detecting very distinct 
10 responses). By specifying ^as we did for a, e.g. a = fi = 0.05, we control type II 
error probability and fix zp 

fi (12) 
AM = (z a -z fi )cr>/2hl (14) 



15 



MSR is expressed as a fraction of the dynamic range of the assay or experiment 
MSR= = (5 LZfg W2gj 



For specified MSR and assay parameters, the minimum corresponding sample size 
20 is 

n J<hZE&$ ( i6) 

^ A/iMSR ) V } 

To gain an understanding of this precision measure, let a = fi= 0.05, then 
MSR is the minimum significant dose response relative to the assay dynamic 
25 range such that 

1. There is a 95% chance that when there is no true difference 
between population mean responses, z £ z a or x\ - x 2 ^ z a <r V2/« = A// 12 
(no significant difference measured between sample means), and 
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2. There is a 95% chance that when ii\ - = A// > 0, *i -* 2 > A/4/2 
(significant difference measured between sample means) 

5 Therefore, specification of MSR and protocol parameters allows one to 

make objective guarantees about the minimum required sample size to control the 
variability of dose response point measurements so that they are unlikely to 
overlap (according to our specification of o, A family of n(MSR) curves are 
plotted in Figure 14. The average population a in row E is a = 8.1453 x 10" 2 , 

10 which would allow a 7.4% MSR for n * 742. 

The control of type I and II error probabilities in this scheme has different 
implications than most traditional hypothesis testing by virtue of its use as a 
measure of experimental precision. A type I error in an inhibitor assay indicates 
that the replicate measurements produced responses outside the experimentally set 

15 control interval. A type II error indicates that nonreplicate measurements 
(different inhibitor concentrations) produced measured responses within the same 
control interval. Nonlinear regression to an inhibitory response model will not 
discsrirninate the different meanings of these errors; only that they both produce 
datapoint scatter and complicate parameter estimation. Therefore, for experiments 

20 involving inhibitory response, it is suggested to constrain a = /?. 

Figure 14 is a curve family showing theoretical dose response resolution 
for <t= {.02,.04,.06,.08,.1,.12,.14} (ascending, a= 0.05 (Type I error probability), 
0.05 (Type II error probability)). In this figure, the number of cells required (y 
axis) is plotted against the fraction of total NFkB translocation that is resolved (x 

25 axis). Measurements made according to this invention fall between cr= 0.06 and a 
= 0.08 (between third and fourth curves from the bottom) as reported in Table 4. 
These relationships allow for easy visualization of the effect of improved or 
degraded measurement fidelity and resolution on cell number requirements. 

Sample Size Requirements: Ding et al. [G Ding, P Fischer, R Boltz, J 

30 Schmidt, J Colaianne, A Gough, R Rubin, and D Uiller. Characterization and 
quantitation of NFkB nuclear translocation induced by interleukin-1 and tumor 
necrosis factor-a. Journal of Biological Chemistry, 273 (44):28 897-28905, 1998] 
report sample size requirements for similar NF/eB translocation experiments 
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carried out on an imaging platform. To demonstrate the superior cell measurement 
fidelity of the technology reported here, Table 6 compares the results published by 
Ding et al. against similar calculations based on the raw measurements data in 
Table 4 and the derivation in the previous subsection. Table 6 provides 
5 comparative estimates of the number of cellular measurements needed to measure 
a given level of stimulation response (from 2% - 100% of the total translocation 
amplitude of about 19% FLIN change) as a function of NFaB fluorophore 
translocation at 95% and 99% type I confidences (a =0.05,0.01; P =0.20). The 
parameters of the experiment were duplicated here to provide a direct comparison 

10 on cellular measurement fidelity. Additional response level 2%-10% and an 
additional column (a -J5 =0.001) were added to further demonstrate results 
according to our invention. In the general population, the same number of cells 
required to measure a 100% response (19% translocation event) using the 
technology reported in Ding et al. allowed a measurement of about 25% response 

15 using the principles of our invention. Reasonable dose response curve estimation 
may require a 20% resolution or better, which at a =0.05, =0.20 would require 
about 12x fewer measurements to achieve using the principles of our invention. 
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% Maximum 


Ding et al. 


Invention 


Ding et ah 


Invention 


Invention 


Response 


(0.05,0.20) 


(0.05.0.20) 


(0.01.0.20) 


(0.01,0.20) 


(0.001,0.001) 


2% 


Not given 


3970 


Not given 


6445 


24528 


5% 


Not given 


640 


Not given 


1039 


3952 


10% 


Not given 


162 


Not given 


263 


1000 


20% 


>500 


42 


>500 


68 


256 


30% 


388 


<30 


>500 


31 


117 


35% 


220 


<30 


331 


<30 


87 


45% 


141 


<30 


213 


<30 


54 


55% 


99 


<30 


149 


<30 


37 


65% 


73 


<30 


111 


<30 


<30 


70% 


57 


<30 


86 


<30 


<30 


80% 


45 


<30 


68 


<30 


<30 


90% 


37 


<30 


56 


<30 


<30 


100% 


31 


<30 


47 


<30 


<30 



Table 6: Comparison of the mimmuni number of cells required to measure percent of the 
total (maximum stimulation) NFxB translocation (19% FLIN change) obtained with the 
invention with those reported in Ding et al., at controlled error probabilities (type I, type 
5 II). The response column gives the percent response of the full dynamic range of the 
measurement (18.8%). Our calculations assume well sample mean measurements 
Gaussian distributed, justified by Central Limit Theorem, for n > 30. For n < 30, student's 
t distribution is appropriate. However, for simplicity, a rninimum of 30 cells is always 
measured. Our calculations assume A/i = 0.1818 and <r linearly interpolated between 
10 0.0650 (uristimulated) and 0.0800 (stimulated). 

Monte Carlo Response Curve Estimates: To quantify the effect of 
measurement precision directly on estimated inhibitor response parameters, a 
Monte Carlo simulation was carried out using the model response shown in Figure 

15 15 which shows a theoretical dose response curve with ICso = 8.895 x 10" 5 , and 
Mnax5 Anin and <? specified by the measurement procedure to be simulated. For 
Figure 15, // max = 1.0, and = 0.0. The circled points indicate die 11 inhibitor 
concentrations used in the simulation. These eleven sites were chosen 
logarithmically across the inhibitor concentration range to which Gaussian 

20 random noise with deviation cr I was added to create 500 simulated 
experimental outcomes. Figures 16 and 17 illustrate the simulated datasets, each 
nonlinearly regressed to the model in Figure 15 using the Gauss-Newton 
procedure [See, for example, WH Press, SA Teukolsky, WT Vetterling, and BP 
Flannery. Numerical Recipes in C, Second Edition. Cambridge University Press, 

25 New York, NY, 1992.]. 

The panels of Figure 16 comparatively illustrate worst case simulated dose 
response experiments achieved by way of empirical curve ICso distribution for 
well-well mean FLIN response. The top three panels show estimated ICso 
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scatteiplots with true value. The middle three panels are ICso frequency 
histograms (compare with the 5 and 95-percentile values in Table 7). The bottom 
panels illustrate a sample of 20 estimated dose response curves from the total 
simulation of 500. The leftmost three panels show a 90% IC$o confidence interval 
5 in the worst case for our invention with full population, with a = 0.093. However, 
our worst case is approximately 3.62x better (narrower) than the rightmost three 
panels (Ding et al.). Results for our G\ subpopulation (center; worst case a = 
0.066) exploit the homogeneity of this subpopulation to give a 90% ICso 
confidence interval that is 5.07x better. Experimental parameters for the 

10 simulation are given in Table 7. 

In Figure 17 best case simulated dose response experiments illustrate 
empirical curve ICso distribution for well-well mean FUN response. Shown are 
estimated ICso scatterplots with true value (the top three panels), ICso frequency 
histograms (the middle three panels; compare with 5 and 95-percentile values in 

15 Table 7); and a sample of 20 estimated dose response curves from the total 
simulation of 500 (in the bottom three panels). The 90% IC$q confidence interval 
for the left hand side (simulation of results produced by the invention; best case 
full population, a = 0.040) is approximately 8.8 Ix better (narrower) than the right 
(simulation of results produced according to Ding et al.). Results for the 

20 invention's G\ subpopulation (center, best case, <r = 0.021) exploit the 
homogeneity of this subpopulation to give a 90% ICso confidence interval that is 
16.3x better. Experimental parameters for the simulation are given in Table 7. 

Standard errors for response point estimates were 0.0093 (worst case Table 
4 standard deviation), 0.004 (best case Table 4 standard deviation) and 8.354 

25 (calculated from the cell numbers for statistical significance reported in Ding et al. 
and replicated in Table 6); both standard errors assume n = 100. Table 7 reports 
the 90% Monte Carlo confidence interval widths for all three outcomes, 5,8585 x 
l6~ 5 , 2.4063 x 10" 5 and 2.1202 x lO^ 4 , showing 3.62x (worst case) and 8.81x (best 
case) stronger confidence in the ICso estimates obtained for our invention. Further 

30 simulations shown in Figures 18 and 19 demonstrate the sensitivity of ICso 
estimate precision on the sampling of experimental concentrations. 

For example, in Figure 18, Monte Carlo simulations of dose response 
regression reveal that ICso estimates becomes more sensitive to cell-measurement 
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precision as the range of experimental concentrations diminishes. The panels in 
the left column show the model sigmoids used in the experiments, ranging from 
11 concentrations (top; circles) to 3 concentrations (bottom; circles). 20 of 500 
runs are graphically plotted for the invention (middle column; tr = 0.0665, A// - 
5 .18), and Ding et al. (right column; a = 83.545, A/z= 56). 

Refer further to Figures 19a and 19b, which show IC$q Monte Carlo 
confidence interval width for Ding et al. (circles; cr = 83.545, Ap = 56) and the 
invention (plusses; a = 0.0665, A// = .18) versus number of sample 
concentrations (3,5,7,9,11). Figure 19b is zoom of Figure 19a. Manifestly, 
10 measurement precision becomes more important as sampling and coverage of the 
relevant concentration ranges decreases. The elbow in the graph corresponds to a 
regression breakdown, showing robust regression results maintained by the 
measurement precision of the invention. 



stddev 


range 


CV 


ICsq 5% 


/C 50 95% 


A/Cso 


0.093 


[.15..33] 


51% 


6.2930 X 10* 


1.2151 X10" 4 


5.8585x10* 


0.040 


[.15,.33] 


22.2% 


7.8407X10* 


1.0247 xlO" 4 


2.4063x10* 


0.066 


[.15..33] 


35.1% 


7.0982x10* 


1.1109X10" 4 


4.1808 x 10* 


0.021 


[.15..33] 


11.7% 


8.2296 x 10* 


9.5332 x 10* 


1.3036x10* 


83.545 


[0,551 


152% 


3.3819x10* 


2.4584 X 10- 4 


2.1202 X10" 4 



15 Table 7: Monte Carlo simulated dose response parameters and estimation results for 500 
runs at n=100 cellular measurements, showing A/C 50 90% Monte Carlo confidence 
interval variability under different measurement precisions. Well-to-well mean FL1N 
response standard errors: 0.0093 (invention; worst case full cellular population), 0.0040 
(invention; best case full cellular population), 0.0066 (invention; worst case G x 

20 subpopulation), 0.0021 (invention; best case G\ subpopulation), and 8.3545 (Ding et al.). 

Homogeneous Subpopulation Analysis 

To illustrate the ability of the invention to estimate quantities of interest 
conditionally on specific morphologic subpopulations, an empirically determined 

25 classification scheme was developed to isolate G\ cells from the S t M t Gi cells and 
fluorescent artifacts. A six-rule classifier (Table 9) was developed from the set of 
standard cellular measurements (Table 8) taken in the Hoechst 33342 or DAPI 
nuclear channel. Analysis of translocation experiment data was repeated for this 
G\ subpopulation at 2500 cells/well (rows DJB) and is reported in Tables 10 (raw 

30 well data), 11 (statistics of well means) and Figure 20 in which the translocation 
response of the G\ subpopulation is plotted. The row aggregate (average of all 
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well average subpopulation FLIN responses) are 0.1418 (unstimulated) and 
0.3317 (stimulated). 

The improved homogeneity of cell response is clearly seen in reduced ±2cr 
population intervals (compare Figure 13, middle, with Figure 20). The 
5 subpopulation defined by Table 9 responds about twice as uniformly as the full 
cellular population, while the mean response is nearly identical. Well-to-well 
variability appears consistent with the full population analysis. 

Table 12 reports sample size requirements for the NFkB translocation 
experiment assuming measurements are restricted to the G\ subpopulation. The 

10 improvement over the Ml population requirements at 20% response resolution 
(giving the minimum of 5 samples used in a dose response regression), is 42 cells 
to < 30 cells (29% reduction) at <z= 0.05, ^= 0.20; 68 to < 30 (56% reduction) at 
a = 0.01, J3= 0.20; and 256 to 47 (82% reduction) at a = 0.001, £ = 0.001. The 
subpopulation analysis reduced cell number requirements by the 94% over data 

15 reported by Ding et al. (a= 0.05 and 0.01 for 20% response case). 

The Monte Carlo simulations were* also repeated using the G\ 
subpopulation statistics. Experimental parameters and numerical ICso 90% 
confidence interval widths given in Table 7 show a 28.6% - 45.8% reduction in 
interval width over full population analysis for the same number of cells n = 100. 

20 Comparison to the simulations based on Ding et al. (as described above) show a 
5.07-fold improvement in the worst case, visualized in Figure 16 (compare middle 
and right columns), and a 16.3-fold improvement in the best case, visualized in 
Figure 17. 

These results indicate that improved response can be exploited by reducing 
25 the required scan area and keeping a specified resolution (optimizing for system 
throughput), or fixing the scan area and thus throughput, but discarding cellular 
measurements in an intelligent manner to select homogenous response populations 
and improving response resolution. These advantages will only be available when 
the fraction of cells that define the subpopulation exceeds a threshold defined by 
30 the response resolution equations. For example, a scan area can be reduced from 
the minimum required to measure the full population np whenever 

n,<n F/ or^ = 4</ 0?) 
n F cr F 
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where / is the fraction of cells comprising the homogenous subpopulation. When 
this is true, the scan area can be reduced by some number of fields dependent on 
cellular density and /. Thus, subpopulation analysis can directly affect system 
throughput whenever a homogenous enough and large enough subpopulation 
5 exists. Similarly, minimum significant response is improved whenever 
MSRy — MSR F > 0 or cr s - a F ji > 0 (18) 
These rules can be validated by examining the curves in Figure 14. 



Table 8: Standard cellular measurements 



measurement definition 


2 


serial number 

x coordinate of nuclear bounding box center 


i 

x* 


3 


y coordinate of nuclear bounding box center 


y* 


4 


x size of nuclear bounding box 


dur* 


5 


y size of nuclear bounding box 


An 


.6* 


nuclear area 


o» " teuickar pixels 


7 
8 


root nuclear area 
integrated nuclear intensity 


0 


9 


average nuclear intensity 




10 






11 


5th percentile of nuclear intensity 




a 


25th percentile of nuclear intensity 




13 


50th percentile of meteor intensity 


/7»-50%Kfty;] 


14 


75th percentile of nuclear intensity 


/*-75%f/.fty;] 


15 


95th percentile of nuclear intensity 




16 


maximum nuclear intensity 


p lM -rnax[/,frX>3 


17 


interquartile range of nuclear intensity 




18 


variance of nuclear intensity 




19 

20 


standard deviation of nuclear intensity 
third central moment of nuclear intensity 


tr. 


21 

22 


3* root of 3* central moment of nuclear intensity 
fourth central moment of nuclear intensity 




23 
24 


4* root of 4* central moment of nuclear intensity 
nil clear perimeter 


mW 

/. - foodear pixels (8-oonuected) 


25 
26 


nuclear wiggle 

perron! feed nu clear wiggle 


w-l/a. 


27 
28 
29 


charmel-j integrated nuclear intensity 
channcl-j integrated cytoplasmic intensity 




30 


channeH FL&V 
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Table 9: Six-rule definition for G\ subpopulation based on cellular measurements defined 
in Table 8 





rule 


e#ec/ (property of cells excluded) 


1 


22.73 £ yfc £ 31.85 


nuclear area is atypical of resting cell nuclei 


2 


io 5 <;/,<; 4xio s 


little or no NFkB fluorophore or large, bright fluorescent artifacts 


3 


Wfl,3 < 0 


dividing, have unusual nuclear shapes or are dying 


4 


.n<w^.2 


two or more aggregate nuclei or irregularly shaped nuclei 


5 


io^/e„^6o 


irregular nuclear textures 


6 


w<4.8 


irregularly shaped nuclei 



5 Table 10: Well-wise and row^jooled FLIN statistics of G\ subpopulation, as defined by 
the classifier in Tables 8 and 9. CVs defined as (std dev)/mean. These results show 
significant reduction in cell response variability, indicating a much more homogenously 
responding subpopulation amongst the total cell population in each well. See Figure 20 
for a visualization of this data. 



Well 


n 


mean 


stddev 


std err CV 


Well 


n 




stddev 


std err 


CV 


DI 


53 


0.139494 


0.026964 


0.003704 0.193301 


El 


204 


0.328162 


0.064145 


0.004491 


0.195469 


D2 


79 


0.135942 


0.021481 


0.002417 0.158018 


E2 


236 


0330019 


0.058368 


0.003799 


0.176861 


D3 


85 


0.142980 


0.029554 


0.003206 0.206699 


E3 


257 


0329866 


0.056664 


0.003535 


0.171778 


D4 


128 


0.137386 


0.027846 


0.002461 0.202687 


E4 


230 


0350588 


0.066655 


0.004395 


0.190124 


D5 


70 


0.154410 


0.027970 


0.003343 0.181141 


E5 


269 


0349991 


0.060755 


0.003704 


0,173590 


D6 


75 


0.153748 


0.027003 


0.003118 0.175630 


E6 


255 


0324779 


0.055980 


0.003506 


0.172363 


D7 


88 . 0.142782 


0.025264 


0.002693 0.176942 


E7 


241 


0337747 


0.058013 


0.003737 


0.171766 


D8 


(02 


0.14S129 


0.029421 


0.002913 0.202720 


E8 


227 


0327979 


0.060758 


0.004033 


0.185250 


D9 


72 


0.153787 


0.025640 0.003022 0.166723 


E9 


290 


0322116 


0.061913 


0.003636 


0.192207 


D10 


114 


0.127378 


0.023832 


0.002232 0.187094 


E10 


278 


0325874 


0.058762 


0.003524 


0.180322 


Dll 


106 


0.136946 


0.026416 


0.002566 0.192890 


Ell 


204 


0331738 


0.064243 


0.004498 


0.193656 


D12 


107 


0.131305 


0.025726 


0.002487 0.195925 


E12 


162 


0320942 


0.047253 


0.003713 


0.147232 


Row 
D 


1029 


0.1406 


0.0277 


0.0008 0.1969 


RowE 2853 


0.3318 


0.0606 


0.0011 


0.1821 



10 



Row 


mean 


stddev 


std err 


CV 


D 


0.1418 


0.0088 


0.0026 


0.0624 


E 


0.3317 


0.0097 


0.0028 


0.0295 



Table 1 1: Well-well statistics of well subpopulation average FLIN (n=12), as defined by 
the classifier in Tables 8 and 9. Mean, standard deviation, standard error and coefficient 
of variation of the FLIN well sample means across each row. Although standard error is 
20 increased due to the reduced number of participating cells in the subpopulation average, 
well-well variability is consistent with the full population data reported in Table 5. See 
Figure 20 for a visualization of this data. 
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% Maximum 
Response 


Ding et al. 
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86 
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90% 


37 


<30 
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<30 


<30 


100% 


31 


<30 


47 


<30 


<30 



Table 12: Minimum number of G\ subpopulation cells required to resolve percent of the 
total NFkB translocation at maximum stimulation, reported here and in [6], at controlled 
error probabilities (type I, type H). Our calculations assume A/i = 0.1818 and cr linearly 
interpolated between 0.0267 (unstimulated) and 0.0542 (stimulated). See Table 6 caption 
10 for additional assumptions. 

Figures 21a and 21b illustrate inhibition of NFaB nuclear translocation by 
BIPI compound A. In these figures, Fractional Localized Intensity in the Nucleus 
(FLIN) is plotted versus inhibitor concentration for full cell population found in 

15 the 10 x 10-field scan area (Figure 21a), and in a 100-cell population (Figure 21b). 
Cell preparation is as described above. Figure 21a illustrates total cell population 
analysis with inhibition of NFkB nuclear translocation in HUVBC cells where 
Ki(Cmpd_A)=6.95Er-05, Rmax=3. 8 61^-01, and Rmin=0.212. Figure 21b 
illustrates 100-cell population analysis with inhibition of NFkB nuclear 

20 translocation in HUVEC cells where Ki(Cmpd_A)=8.09E-05, Rmax=4.21E-01, 
and Rmin=0.212. 

Figures 22a and 22b illustrate inhibition of NFkB nuclear translocation by 
BIPI compound B. In these figures, Fractional Localized Intensity in the Nucleus 
(FLIN) is plotted versus inhibitor concentration for foil cell population found in 

25 the 10 x 10-field scan area (Figure 22a), and in a 100-cell population (Figure 22b). 
Cell preparation is as described above. Figure 22a illustrates total cell population 
analysis with inhibition of NFkB nuclear translocation in HUVEC cells where 
Ki(Cnrpd_B=5.08E-05, Rmax=3.77E-01, and Rmin=0.212. Figure 22b illustrates 
100-CelL population analysis with, inhibition of NFkB nuclear translocation in 

30 HUVEC cells where Ki(Cmpd_B)=6.65E-05, Rmax=4.12E-01, and 
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Rmin=0.212. Figures 23a and 23b illustrate inhibition of NFxB nuclear 
translocation by BIPI compound C. Fractional Localized Intensity in the Nucleus 
(FLIN) is plotted versus inhibitor concentration for full cell population found in 
the 10 x 10-field scan area (Figure 23a), and in a 100-cell population (Figure 23b). 
5 Cell preparation is as described above. Figure 23a illustrates total cell population 
analysis with inhibition of NFkB nuclear translocation in HUVEC cells where 
Ki(CmpdJ>1.01E^06, Rmax=3.99E-01, and Rmin=0.212E-01. Figure 23b 
illustrates 100-cell population analysis with inhibition of NFkB nuclear 
translocation in HUVEC cells where Ki(Cmpd_C)=1.36E-06, Rmax=4.20E-01, 

10 and Rmin=0.212. 

Figure 24 illustrates inhibition of NF/eB nuclear translocation by BIPI 
compound B in the Gi cell subpopulation defined by Tables 8 and 9 with 
Fractional Localized Intensity in the Nucleus (FLIN) plotted versus inhibitor 
concentration for full cell population found in the 10 x 10-field scan area. Cell 

15 preparation is as described above.. In this figure, Gl cell population analysis is 
shown with inhibition of NFkB nuclear translocation in HUVEC cells where 
Ki(CrnpdJB)=3.70Er-05, Rmax=3.82E-01, andRmin=0.212. 
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Compound R max (FUN) (SE) (FLJN) (SE) IC S0 (FLJN) (SE) 

A 0.386 (8.17 x 10" 3 ) 0.212* 6.9S x 10' 5 (2.2 x 10 5 ) 

A (n=100) 0.421 (6.70 x 10" 1 ) 0.212* 8.09 x 10* 5 (1.8 x 10" 5 ) 

B 0.377 (3.63 x 10^) 0.212* 5.08 x 10' 5 (6.8 x 10 6 ) 

B (n=100) 0.412 (5.85 x 10" 3 ) 0.212* 6.65 x 10* 5 (1.3 x lO" 3 ) 

B ((?,) 0.382 (3.28 x 10 J ) 0.212* 3.70 x 10" 5 (4.1 x 10"*) 

C 0.399 (6.90 x 10" 3 ) 0.212 (3.78 x 10°) 1.02 x 10* (1.6 x 10" 7 ) 

C (n=100) 0.424 (8.96 x IP" 3 ) 0.212* 1.15 x 10* (2.2 x 10' 7 ) 

Table 13: Inhibitor response parameter estimates and standard errors, given by nonlinear 
regression analysis. Umty-slope-at-inflection sigmoid models were fit to triplicate wells at 
seven concentrations to determine the response characteristics for BM inhibitor 
5 compounds A3 and C. Each compound was evaluated using the full population of 
cellular measurements found in a 10 x 10-field scan area within each well, and also 
limiting the number of cells measured to 100 in each well. R^ could not be established 
independently in compounds Afi (see Figures 21a/21b and 23a/23b), and so R m b was 
fixed at 0.212 for these compounds (all compounds with fixed Rmi* marked with 
10 asterisks), as determined by the full population analysis of compound C. The G x 
subpopulation response was also measured for compound B. Cells were prepared as 
described above. 

Analysis of Inhibitor Compounds: In another wellplate, cells were treated 
15 with different concentrations of three inhibitor compounds (A,B and C), and then 
analyzed to assess die inhibition of NFkB translocation. In these experiments, 
both the full number of cells found in a 10 x 10 scan area, and a fixed 100 cell set 
were measured and compared. All three compounds responded clearly (Figures 
21a-23b), with nearly identical uninhibited translocation (R^) in the range 
20 0.3772 - 0.4239. Compounds A and B failed to plateau at maximum inhibitor 
concentration, and response estimates foiled to converge with a floating 
parameter. Compound C achieved maximum inhibition with R m m ~ 0.212 (full cell 
population) and 0.220 (100 cell population); R^ was subsequently fixed to 
achieve response convergence on compounds A and B. Table 13 summarizes all 
25 estimated response parameters calculated by nonlinear optimization (unity slope at 
inflection is assumed in the response model). ICsqs varied 1.02 xlO^-l.lSxlO" 6 
for compound C, and by a factor in the approximate range 32 - 80 greater for 
compounds A and B. 
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integrate the A^* image channel 
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move between adjacent wells and setup scan 


1.00s 


h{F 


move between adjacent fields 


0.30s 


titjy ' 


return from end of field row inside a well 


0.50s 


tp 


focus on a single field 


0.40s 


h 


transfer image data from camera 


0.11s 


k 


reconfigure emission/excitation filters 


0.20s 


ts 


open/close shutter 


0.03s 


tc 


adjust camera parameters 


0.10s 



Table 14: Scan rate timing model components with definitions and estimated default 
values. 



5 





tndd 


Avell.3x3 




0.0167s 


0.9834s 


11.65s 


20.22m 


0.5000s 


1.4667s 


16.0s 


27.1Sm 



Table 15: System timing examples showing typical system throughput dependence on 
assay and cell preparation protocol, assuming two channel scans, 3x3 well scan areas, 96 
well plates and estimated component times listed in Table 14. A bright fluorescent 
intensity (t ltl = 0.0167.?) requires 0.9834sAvell and 20.22m^)late, while a dimmer 
10 fluorescent intensity (t /2 = 0.500Qs), perhaps due to fluorophore binding to a less 
concentrated target, requires 16.0s/well and 27.18m4>late. Both secondary channels 
assume a 'bright nuclear fluorophore as the primary channel (t ItX = 0.0167j). Sufficient 
online image processing power is assumed available so that throughput is scan-hardware 
limited. 

15 

Scan Rate Estimates: A simple model of scan rates is broken into plate, 
well and field components 



20 tptauMxN = MMU/.0rK + (MN - l)t MW (19) 

twe!UQ X R = QRtfield + Q(R - l)tMF + (Q - l)t*W (20) 

tjieid =t s + t F + J^ + wrr+^-^fe + ^c) (21) 

with definitions and estimated values given in Table 14. The assumptions in this 
model include: (1) wellplates are scanned in a zigzag pattern, (2) wells are 
25 scanned in a raster pattern requiring a return between rows to maximize field 
alignment, (3) online processing occurs completely in the background so that 
system is scan-hardware limited. 
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Table 15 gives timing example estimates for scanning a 2-channeI, 96 
wellplate with 3x3 well scan areas for the two cases of a bright and dim 
(requiring integration) secondary channel (examples both assume a bright primary 
nuclear channel), resulting in 11.65s/well, 20.22m/plate for the bright assay, and 
5 16.0s/well, 27.18m/plate for the dim assay. These examples are not specific to a 
particular application, which may require more or less time depending on 
parameters, but are intended to suggest the scope of scan rates that will be typical 
for assays developed for the platform of Figure 1. 

When comparing scan rates for the platform of Figure 1 with other 

10 imaging platforms, tp and Q x R are the dominant factors. We found that = 0.4s 
was achieved due to the speed of the autofocus technology described above. The 
well scan area Q x R is chosen to allow imaging of a sample size large enough to 
accommodate some minimum significant response. Since a required minimum 
significant response (MSR) fixes n, we can define the ratio of the required field 

15 acquisition times of two systems or assays as the time efficiency (TE) for 
comparison 

^ (MSR) = ^ggj^ (22) 

where A, B are comparable systems and d is the cellular field density (cells/field). 
TE accounts for differences in number of cells necessary between systems, but for 

20 simplicity, neglects times related to stage motion. 

Comparing the systems from Table 6 for NFkB translocation (B is the 
platform of Figure 1) at a = 0.05, 0= 0.20, &Jd A = 0.25, MSR = 20% and letting 
tfieidJtfieiM vary between 2-6 gives a TE range of 5.95 - 17.85. Although this 
efficiency comparison relates strictly to 1NF*B translocation assays as reported 

25 here and in Ding et al., the range illustrates fundamental advantages over the 
platform described in Ding et al. afforded by the autofocus and image 
segmentation speed, and in image measurement fidelity that will manifest in other 
assays (specific TEs would have to be calculated for each). The high end of the 
TE range here would become possible, for example, if the protocol were 

30 optimized for the platform of Figure 1 to brighten the secondary channel, thereby 
taking better advantage of the autofocus speed. Scan rate comparisons between 
NF/cB assays can also be cast into an absolute time scale by additionally assuming 
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specific cells/field densities. Figure 25 shows time (seconds) versus MSR (percent 
of total translocation dynamic range) curves for both the platform of Figure 1 and 
the platform used by Ding et al., under typical experimental conditions. 

Figure 25 is a family of curves produced by plotting time required to 
5 achieve a minimum significant response ((MSR) as a percent of total NF*B 
translocation on maximum stimulation) per well. In these curves (lowest to 
highest), cr = 0.040, best case for the platform of Figure 1; a = 0.093, worst case 
for the platform of Figure 1; cr = 0.277 = 2.98 x 0.093, the platform of Ding et al., 
2.98 = ratio of measurement CVs (see Table 7). Assumptions: a = fi - 0.05, 

10 *fieid,Dingetai — 3 s> ffieid,figurei platform = 1.4667$, ^Dingetai = 40 cells/field (lOx objective 
magnification), <afegureip!atform = 10 cells/field (20x objective magnification). 

An Automatic Tool for Achieving Minimum Significant Response in a 
Drug Screen: The error measurements described above can be used to predict the 
number of cells needed to reach a required minimum significant response for in a 

15 given compound screen. A graphical user interface tool will lead the user through 
the steps of measuring control and test wells to determine the errors. This 
predictor tool will allow the user to manipulate the time required to scan a plate as 
a function of the minimum significant response for the cell population and/or 
subpopulatiorL With this automation, the use can easily determine whether or not 

20 to use a subpopulation or the full population and how much time will be required 
to reach a given significance. The scan will then proceed automatically by 
scanning each well until the required number of cells is measured. 

Refer now to Figures 26a, 26b, 27, 28, and 29 for an understanding of a 
method of performing an automated drug screen according to the invention. These 

25 figures include screens of a graphical user interface (GUI) designed to afford a 
user the ability to select and set parameters to perform the automated drug screen 
using the principles and functions of this invention; they also include flow charts 
representing initialization and performance of a drug screen according to the 
invention. The illustrated method represents functionality invested in the high 

30 throughput platform illustrated in Figure 1 by programming the host computer 110 
with a series of software instructions. It may also be said that these figures 
illustrate a method pertbrmable by a programmed computer. The GUI screens 
represent visible output produced on the monitor 115; they also represent 
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functionality for identifying to a user parameters of a drug screen, and for 
enabling the user to input values for those parameters to the host computer 110. 
These screens may also be referred to as "GUI tools". 

Referring now to these figures, Figures 26a, 26b, and 28 show user- 
5 executed steps carried out prior to performing the automated drug screen in order 
to initialize parameters that control the screen. The user first chooses an assay that 
may be run for many days or weeks. NFDB is an example of such an assay. The 
Type I and Type II errors (equations 8-16) are then entered into the host computer 
110 through a GUI tool shown in Figure 26a. Control experiments are then 

10 performed to measure the variability and amplitude of FLIC. When these values 
are at hand, they are entered into the host computer 1 10 through a GUI tool shown 
in Figure 26b. The variability is the standard deviation (SD) or coefficient of 
variation (CV) of FLIC and the percent amplitude is computed as 
100x(maximum-minimum)/(instrument measurement range). The maximum and 

15 minimum are determined by the cellular responses to controls (usually with no 
compound and with a maximally stimulating compound). 

The automated drug screen is then performed by the high-throughput 
platform of Figure 1, Refer to Figures 27 and 29 for an understanding of how tis 
screen is performed. First, screening parameters are entered using a GUI tool such 

20 as that shown in Figure 27. If limiting the time of the drug screen is the primary 
concern, the user enters a desired time per plate (typically in minutes) using the 
Time/Plate field of the GUI tool, and the estimated MSR and Cells/Well are 
calculated automatically and displayed. If the resulting MSR is deemed 
inappropriate, the user may choose to shorten or lengthen the Time/Plate as 

25 required to achieve the desired MSR. The MSR also depends on the number of 
cells per image which is also entered by the user using the Cells/Image field of the 
GUI tool. The actual MSR will be a function of the number of cells per image 
acquired as the drug screen is performed. If achieving a certain MSR is deemed . 
more important, the user enters the desired MSR using the MSR field of the GUI 

30 tool and enters an average number of cells using the Cells/Image field. Then the 
Cells/Well value and estimated Time/Plate are calculated automatically and 
displayed in the corresponding fields of the GUI tool. In this case, the actual time 
will depend on the value in the Cells/Image field and the high throughput platform 
of Figure 1 will acquire images until the desired MSR is achieved. The user can 
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also enter the value in the Cells/Well field and the average number of Cells/Image 
and the MSR and Time/Plate will be calculated automatically and displayed in the 
corresponding fields. 

The flow chart of Figure 29 shows the steps executed by the high 
5 throughput platform of Figure 1 in automated scanning. The drug screen may 
consist of scanning a single microtiter plate (with 96, 384 or any number of wells) 
or a stack of microtiter plates delivered by a plate robot for a multi-plate scan. The 
scan is initiated in step 2900 by the user after placing a plate on the microscope 
stage (or the plate is loaded by the robot). The platform then moves the plate to 

10 the first well (step 2910), performs autofocus (step 2920), acquires the color 
image in step 2930 (each color represents a different fluorescent dye staining a 
different cellular molecule and/or cellular compartment), performs image 
segmentation in step 2940 (finds the masks of the cellular compartments), 
separates compartments of overlapping cells using tessellation, also in step 2940, 

15 and measures FLIC in step 2950. In step 2960 a test is then performed to 
determine if enough cells have been measured to achieve the desired MSR or if 
the number of images has been acquired to achieve the desired time/plate. If not, 
in step 2970 the platform moves to the next field of view (image) within the same 
well and repeats the acquisition and analysis steps. When enough images have 

20 been analyzed, the platform moves the plate to the next well in step 2910 and 
repeats the same steps. When the last well in the experiment has been analyzed 
(step 2980), the plate is completed and either the next plate is automatically 
loaded (step 2990), or the scan is complete. When all of the plates in the 
experiment have been analyzed, the drug screen is finished. 

25 

System Design Considerations for Tessellation 

An 0(n log n) Voronoi Tessellation Algorithm: With reference to Figures 
30a-30h, this section describes the algorithm used to create Voronoi tessellations 
for a collection of random points in the plane. The points come from an image 
30 analysis program that computes the position of various objects from a scanning 
cytometer. The objects may be cell nuclei or other images. The only important 
fact is that the objects can be represented by a set of nodes specified by X and Y 
coordinates. For cells, cell nuclei and other objects the assignee's U.S. Patent Nos. 
5,548,661 and 5,790,692, describe image segmentation techniques that can be 
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used to locate the objects in the image. Once those or other image segmentation 
techniques have been used to create a mask of pixels describing the locations of 
the object pixels of each object, the centroids or other single-pixel node can be 
computed that represents the object for tessellation. 
5 A useful objective is to tessellate a magnified, segmented image so that 

each node lies inside its own polygon. The interior of the polygon is closer to its 
node than to any other node. This tessellation is called the Voronoi tessellation, 
and it can be generated directly, or from its dual graph, a Delaunay triangulation 
of the plane, in which the nodes form the vertices of triangles. Refer to Figure 30a, 
10 which shows a set of nodes 3000 in a plane 3001. In the tessellation algorithm 
employed in our invention, the four corners 3002 of the plane are added as 
additional nodes and the resulting set of nodes is triangulated, as shown in Figure 
30b. 

Many different triangulations are possible. The triangulation shown in 

15 Figure 30 c has the "Delaunay property," namely that the circle that circumscribes 
each triangle contains no nodes in its interior. In Figure 30c, such a circle 3008 
(called a "circumdisk") circumscribing the central triangle contains no other 
nodes. Each of the triangles has its own circumdisk, and no node lies in the 
interior of a circumdisk. 

20 Note in Figure 30c that each node is the vertex of several triangles, which 

can be ordered either clockwise or counterclockwise. By connecting the centers 
of the circumdisks of these triangles (for example, by line 3010), one obtains the 
"Yororioi tessellation," as shown in Figure 30d. A Voronoi tessellation is dual to 
the Delaunay triangulation. The polygon edges are perpendicular bisectors of the 

25 edges of the triangles. The region is now split up into polygons around each node. 
Each polygon contains all points closer to its node than to any other node. Note 
that the four corner nodes also have polygons assigned to them. 

Ours is an efficient algorithm to create die Delaunay triangulation and 
Voronoi tessellation for an arbitrary collection of nodes in the plane. According to 

30 our algorithm, if n is the number of nodes, then our algorithm scales linearly with 
n in the memory requirements, and scales with n log n for time requirements. This 
is close to optimal. Using our algorithm, we have found that 100000 nodes can be 
triangulated and polygonized in under 5 seconds, with memory requirements on 



45 



WO 03/078965 PCT/US03/07968 
the order, of 35 MB. A million nodes can be done in around 75 seconds, with 379 
MB required for storage. The machine used is a Wintel box running at 800 MHz. 

Presume computation of the Delaunay triangulation for a set of nodes. 
The minimal set of data needed for this is the following: 
5 1) Node coordinates: An X and Y coordinate for each node, stored as 

floating point numbers. The most convenient way to do this is to assign 
each node an integer index, starting with 0 and going up to n-1, where n is 
the number of nodes. Then the X and Y coordinates are stored in two 
large double-precision arrays. 
10 2) Triangle nodes: Hie indices of the three nodes that make up each 

triangle. The easiest way to store this data is by assigning each triangle an 
integer index, starting with 0 and going up to N-], where N is the number 
of triangles. N is approximately 2n. Then one stores the nodes of each 
triangle in three integer arrays, in counterclockwise order going around the 
15 triangle. 

It is also convenient (though not strictly necessary) to store the center 
coordinates and radius of the circumdisk of each triangle. These will be stored in 
double-precision arrays indexed by the triangle index. Given the coordinates of 
each of the three nodes of a triangle, it is a simple linear algebra problem to 
20 compute the coordinates of the center of the circumdisk. Then the radius is given 
by the Euclidean distance from the center to any one of the nodes. 

We next describe an algorithm for efficiently finding the coordinates for 
all the polygons in the diagram in one pass. The algorithm is as follows: 

1) Create an array of lists of triangle indices, one list for each node. 
25 So the array of lists will be indexed by the node index. 

2) For each triangle, get its three node indices. There is a list for each 
of these nodes. To each of these three lists, add the triangle index. 

3) To construct the Voronoi polygon for any node, get its list of 
triangles. Order the list counterclockwise by computing the angle that the 

30 center of the circumdisk of each triangle makes with the node, and then 

sorting in increasing order. The centers of the circumdisks, ordered in this 
way, form the vertices of the Voronoi polygon for the node. 
All of this can be implemented easily using Standard Template Library 
(STL) components. The coordinates and triangle nodes can be stored in vectors of 
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doubles and integers. The array of lists can be implemented as a vector of vectors 
Of integers. To minimize storage, the vectors should be sized in advance using the 
reserve( ) method of the vector class. For example, the coordinate arrays should 
have n entries reserved. Hie nodal arrays for triangles should have N entries 
5 reserved. The array of lists should have have n lists reserved, with each list 
having 8 entries reserved. It is rare for a node to have more than 8 triangles; in the 
few cases where a node has more than 8, the STL vector will automatically resize 
itself.. This algorithm scales in time linearly with the number of nodes, and it 
executes much faster than the triangulation algorithm. 
10 We now describe an iterative method for triangulating a set of nodes 

contained in some rectangle. At each stage, a valid Delaunay triangulation of k 
nodes will be assumed, and then the next node will be added and edges will be 
rearranged so as to yield a Delaunay triangulation of k+1 nodes. 

Initially, a Delaunay triangulation of the four corner nodes of the 
15 containing triangle is constructed. This is shown in Figure 30e. With a Delaunay 
triangulation of the four corners of a rectangle, the same circle circumscribes both 
triangles, and its center is at the midpoint of the diagonal. Now presume 
successful triangulation a collection of k nodes, and that we want to add the next 
node. Assume that the new node lies in the interior of the existing triangulation. 
20 Since the description started with a bounding box, this is a safe assumption. 

The algorithm for modifying the mesh is illustrated in Figure 3 Of: 
1) Find the set of all triangles whose circumdisks contain die new 
node. These triangles will have to be deleted, since a valid Delaunay 
triangulation never has a triangle whose circumdisks contain nodes. 
25 2) The union of these triangles forms a polygon containing the new 

node. Delete all the triangje edges contained in the interior of this 
polygon. 

3) Create new edges connecting the new node to each of the vertices 
of this polygon. 

30 The node 3050 is the new node being inserted. It is contained in the 

circumdisks of four previously existing triangles, having a total of six nodes 3051 . 
The six exterior edges connecting the nodes 3051 are left alone. The three interior 
edges shown by dashed lines are deleted Six new edges (the black solid lines) are 
created, each connecting the new node 3050 to one of the outer nodes. By 



47 



WO 03/078965 PCT7US03/07968 

construction the circumdisk of each of the new triangles intersects the new node, 
and the new node is not in the interior of any other circumdisk. Hence, the new 
triangulation is still Delaunay. The operation has increased the total node count 
by one, the total triangle count by two, and the total edge count by three. This 
5 accounting holds true for each node added to the triangulatioiL Therefore, the 
final mesh will have approximately 2n triangles and 3n edges, where n is the 
number of nodes. Note that the initial mesh has 4 nodes, 2 triangles, and 5 edges. 
Accordingly, the exact result is that the mesh will have 2n-6 triangles and 3n-7 
edges. 

10 Thus far, we have described the algorithm above only generally. A 

complete description requires a specification of the data structures and the steps 
needed to carry out the steps. The simplest procedure is to use no new data 
structures. Step (1) can then be performed by a brute-force search of all triangles. 
Each triangle whose circumdisk contains the new node is added to a list. In 

15 practice, this list is small; it almost always contains fewer than 10 entries. Steps 
(2) and (3) can then be performed quickly. However, the brute-force search in 
step (1) requires an operation for each triangle, which is approximately twice the 
number of nodes. Since the brute-force approach requires a search for each node, 
the time to triangulate the mesh requires of order n 2 operations, and this is rather 

20 poor. 

The algorithm thus far described is known as Watson's algorithm, and has 
been around since the 1970s. In the next section, we describe an improvement to 
this algorithm that substantially speeds up the triangulation process for large 
meshes. 

25 The bottleneck in the triangulation scheme described above occurs where 

all circumdisks containing the new node to be inserted are found. If even one 
triangle whose circumdisk contains the new node can be found, then the search 
can be restricted to the nearest neighbors, which would substantially increase 
efficiency. So the problem reduces to finding that first triangle whose circumdisk 

30 contains the new node. 

Our preferred solution is use of the so-called "sweepline" method. A 
sweepline algorithm for computing the Voronoi tessellation was published by S. 
Fortune in 1987. Our algorithm uses something similar for computing the 
Delaunay triangulation. The idea is to sort the nodes in increasing order of the X- 
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coordinate. Ties would be broken by sorting on the Y-coordinate. Since we 
assume that all (x, y) pairs are distinct, this defines a unique ordering of the nodes 
to be inserted. We then insert nodes in this order, using the algorithm above, but 
now using an intelligent search rather than the brute-force search. One can 
5 imagine a vertical line sweeping across from left to right. At any time, the vertical 
line is located at the X-coordinate most recently inserted. All nodes to the left of 
the line have been inserted; all nodes to the right are not inserted yet. 

A brute force sweepline algorithm searches backwards in the list of 
triangles. This list will have the most recently inserted triangles at the tail of the 

10 list. If the algorithm starts at the tail of the list, it will search the right-most 
triangles and will likely find one quickly whose circumdisk contains the next node 
to be inserted. This simple solution works well, and reduces the triangulation time 
to order n log n. But there is a fester search algorithm, which is also reasonably 
simple, so we'll describe that here. 

15 A faster search algorithm depends on the fact that nodes are inserted inside 

a rectangle that is triangulated to begin with. So the right side of the mesh 
consists of a large number of long, skinny triangles that contain one of the two 
corner nodes on the right, either the northeast or the southeast node. Our 
sweepline algorithm is illustrated in Fig. 30g. In this figure, the four corner nodes 

20 define the maximum extent of the region, and were triangulated first. The nodes 
3062 in the interior of the rectangle have been inserted from left to right. The 
shaded region 3065 represents a large number (possibly millions) of nodes which 
have already been inserted. The vertical sweepline 3070 marks the line of 
demarcation. To its left, all nodes have been inserted into the mesh. To its right, 

25 nb nodes have been inserted. The red node is the next node in the list. Since it 
lies to the right of the sweepline, it must lie in one of the long skinny triangles 
containing either the northeast node 3060n or the southeast node 3060s (or it can 
lie in the large "central triangle" that contains both the northeast and southeast 
nodes). There are roughly sqrt(n) long skinny triangles connecting to the corner 

30 nodes. We put these into two lists, the "northeast list" and the "southeast list" and 
sort them by the slope of their lower edges. Then the node to be inserted will 
either be in the central triangle or it can be rapidly located in one of these lists 
using a binary search. The time to perform this search is of order log(sqrt(n)). It 
should be clear that the time to triangulate all the nodes is of order n log(sqrt(n)). 
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Technically, this is still of order n log n> but it should be clear that the operation 
count is pretty small. 

We have implemented this algorithm and its performance is of order n log 
n all the way up. to half a million nodes, (which can be triangulated in less than 30 
5 seconds). We see slightly more time required for a million nodes, but the reason 
is not clear. It could be memory allocation problems, or some other subtlety. We 
anticipate the need to do meshes with 100000 nodes as quickly as possible. The 
current implementation can create the mesh in under 5 seconds. We expect that 
this can be reduced by 10 to 20% by tweaking the code. 

10 We now describe how to complete the list of triangles whose circumdisk 

contains the node to be inserted once the first triangle has been found using the 
algorithm given above. The solution to this requires bookkeeping. The idea is 
that all such triangles are adjacent So as soon as the first one is found, one need 
only search all the nearest neighbors in some orderly way, until all of the 

15 possibilities have been eliminated. This is implemented as a recursive operation. 

First, we create an STL set containing the indices of all triangles whose 
circumdisk contains the new node. We initialize it with the first triangle found, 
and then call a method mat tests the triangle and recursively checks its three 
neighbors. The recursion stops if a triangle is already in the set, or if its 

20 circumdisk doesn't contain the node. We have found experimentally that these 
sets tend to be small and that it is actually faster to use an STL vector, to avoid the 
costs of insertion into a set. The effect is a speed boost of about 12%. 

Next the neighboring triangjes of each triangle must be found. The 
simplest way to do this is to maintain a map of each edge to the triangje on its 

25 right, as shown in Fig. 30h- hi this figure, The three nodes of the shaded triangle 
3080, N/, N?, and define three edges going counterclockwise around the 
triangle. The ordered pair (Ny, is mapped to the index of the triangle 3082 on 
their right Likewise, the ordered pair (N},N^ is mapped to the shaded triangle 
3080. 

30 The easiest way to do this mapping is to define an STL map that takes a 

pair of integers (the node indices) to a third integer (the index of the triangle to the 
right). However, for large meshes, the memory requirements are quite large. 
Furthermore, mis is relatively slow. One should remember that there are a lot of 
edges. For each node in the mesh, there are approximately 6 directed edges. 
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An easier way to do the mapping is to make an STL vector of maps. The 
vector is indexed by nodes. The idea is that each node in the mesh points to 
several others (on average, 6 others). So each entry in the vector will be a map of 
an integer to an integer, mapping a node index to a triangle index. This is 
5 substantially more efficient than the map of integer pairs to integers. 

However, even better results are obtainable. The map of an integer to an 
integer is still inefficient, since the main operations are inserting and deleting 
entries. It is faster and more space efficient to replace them by vectors of pairs of 
integers. The first entry in the pair is a node index; the second entry in the pair is 

10 the triangle index. So our data structure is a vector of vectors of pairs of integers. 
It is imperative to use the RESERVE method (a function in Microsoft® Visual 
C++®) to allocate a reasonable amount of space for the vectors. Otherwise, the 
STL will allocate several kB for each vector, and the memory requirements will 
be outlandish. 

15 This is a common theme in working with the STL. Small sets (or small 

maps) are not as efficient as small vectors (or small vectors of pairs). Sets and 
maps are faT more convenient to work with for the programmer, but they just can't 
match the efficiency of a vector when you're dealing with a handful of items. 
Ordinarily, this doesn't matter. But when one tries to optimize the inner loop, it 

20 can make sense to replace the set with a vector, as long as a small amount of space 
is reserved for the vector. The STL vector will grow automatically if necessary, 
but it's best to make this a rare event In our case, nodes typically average 6 
neighboring nodes, with 8 neighbors being unusual, and more than 8 being rare. 
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